To: JEFF K who wrote (37128), 11/6/1998 5:46:00 PM
From: John Rieman
 
C-Cube is not a Chip company or a Communications company, they are Artists...
(How old does art have to be to have value?)

digitaltelevision.com

The Great Compression Debate, Part 3:
MPEG-2 Art
Splicing, Editing and Keying
By Michael Silbergleid

Television, as we know it, has changed. Then again, television has always changed. From black and white to color, mono to stereo to multichannel, composite to component, analog to digital, baseband to compressed.

Perhaps the biggest changes are still to come--NTSC to ATSC and PAL to DVB. These changes mean a new way of working with video and audio in MPEG-2 compression, the specification underlying both transmission standards. This has led to a desire by broadcasters and broadcast equipment manufacturers on both sides of the Atlantic (as well as both sides of the Pacific) either to remain in baseband as long as possible before compression or, once compressed, to stay compressed.

The reasons for this are simple: when you compress a signal, you throw away part of that signal--something you might need later--and repeated encoding and decoding cycles can have an unwanted effect on the quality of the image and sound.

So ever since manufacturers, broadcasters and regulatory bodies decided that MPEG-2 would be a good idea for one part of the broadcast chain (transmission), the industry has been trying to adapt MPEG-2 to all of the other parts--production, post and storage--so that once in MPEG-2, you could stay that way throughout the chain.

Unfortunately, we tend to want to do things to our video (like dissolves, wipes and keys) which make life in the MPEG-2 world rather difficult. The consensus has been that you couldn't accurately switch, splice, edit, key or do effects in the MPEG-2 realm. That isn't, and has never been, altogether true.

What MPEG-2 Is And Isn't

MPEG-2 is just a set of compression tools. It is not a tape format or a transmission format. For production and post, we use the MPEG-2 4:2:2 toolkit. For transmission, we use the MPEG-2 4:2:0 toolkit. While this is an ultra-simplistic way of looking at MPEG-2, it is accurate.

Keep in mind that there are three types of MPEG-2 frames and that you can use different combinations of these frames during compression:

I frame: Intraframe. This frame contains all the information needed to construct a complete frame of video, much like an M-JPEG or DV frame. Video that is encoded using only I frames can be switched and edited (cuts only) without having to decompress the signal. Provides the least encoding efficiency, since the complete frame is encoded.

P frame: Predictive. The frame is encoded using motion-compensated prediction from a past reference picture. Contains the difference between the actual image and the predicted image. Provides better encoding efficiency than I frames.

B frame: Bi-directional predictive. The frame is encoded using motion-compensated prediction from both a past and a future reference picture. Contains the difference between the actual image and the predicted image--it is not an average of the previous and future frames. Provides the best encoding efficiency; however, a third memory buffer is required in addition to the buffers for the past and future reference (I and P) pictures. Fast search modes ignore B frames.

MPEG-2 streams can be encoded using I frames only; I and P frames; I and B frames; or I, P and B frames. An I frame is always needed and is the first frame in a Group of Pictures (GoP), which ends just before the next I frame. GoPs can have a length of one (all I frames) or more. The larger the GoP, the more efficient the compression encoding, but the harder the stream is to manipulate and the less resilient it is to transmission errors (since an I frame is needed as a reference).
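One way to see the trade-off is to model a stream as nothing more than a sequence of frame-type letters and ask which frames permit a clean cut. The rough Python sketch below does exactly that; the per-frame cost weights are invented placeholders, since real frame sizes depend on content and encoder settings.

COST = {"I": 1.0, "P": 0.4, "B": 0.2}  # assumed relative frame sizes

def stream(gop_pattern, num_gops):
    """Repeat a GoP pattern, e.g. 'IBBPBBPBBPBB', num_gops times."""
    return gop_pattern * num_gops

def clean_cut_points(frames):
    """Only I frames are self-contained, so in this simplified model
    only they allow a cut without decoding."""
    return [i for i, f in enumerate(frames) if f == "I"]

def relative_cost(frames):
    return sum(COST[f] for f in frames) / len(frames)

for pattern in ("I", "IBBPBBPBBPBB"):
    s = stream(pattern, 4)
    print("%-12s cut points: %s  avg cost/frame: %.2f"
          % (pattern, clean_cut_points(s)[:4], relative_cost(s)))

The I-frame-only stream can be cut anywhere but costs the most per frame; the long-GoP stream is far cheaper but offers a cut point only once per GoP.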

So, MPEG-2 causes problems because:

--If you try to manipulate anything other than an I frame, it's like trying to edit or switch on a frame that doesn't exist...because it doesn't.

--You can't manipulate something, like an MPEG-2 frame, that really isn't video.

Unfortunately, television has progressed from its early days, when camera lens turret moves would be seen on the air and the viewing public accepted them. (If you've never seen a lens turret move, imagine seeing a wide shot, your picture going blurry and then black for a second or two, and then seeing a close-up.) Today, we want everything to be seamless and perfect, and MPEG-2 makes that rather difficult. Here's how:

Splicing

You want to switch between two MPEG-2 streams. To do this perfectly, the streams need to be synchronized. To do it the instant you push the "take" button, an I frame must also pass by the switch at the right moment. With I-frame-only streams this is simple, but imagine having a GoP where I frames come only every half a second or more.

There are some ways around this. The NDS System 3000 MPEG Splicer/Transcoder can switch between three MPEG-2 streams. The catch is that right now (actually Q1 1999) you can only splice in non-seamless and near-seamless modes; NDS is waiting for SMPTE to finalize the seamless standard. And there are some problems with this technology. During a non-seamless splice, the viewer would see either a blank or a frozen frame generated by the viewer's decoder for anywhere from 3 to 26 frames, depending on the splice point and on when the next I frame appears.

For near-seamless splicing, there must be control over one of the sources, according to Mike Knowles, manager of business development-digital terrestrial for NDS; that source must therefore be local and on disk. With near-seamless splicing, the delay happens at the station. When the "take" button is pressed, the splice doesn't occur until after the last frame in the GoP, so the next I frame comes from the local material. The latency for near-seamless splicing depends on the size of the GoP from the outside source and on where in the GoP the "take" button is pressed.
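That latency is easy to put numbers on. The sketch below assumes a fixed-length GoP and a 30 fps frame rate purely for illustration; real GoP lengths vary from stream to stream.

def splice_latency_frames(gop_length, take_position):
    """Frames to wait from the 'take' until the current GoP ends,
    so the next I frame can come from the local material."""
    return gop_length - take_position

FPS = 30.0  # assumed frame rate
for pos in (0, 7, 14):
    wait = splice_latency_frames(15, pos)
    print("take at frame %2d of a 15-frame GoP: wait %2d frames (%.2f s)"
          % (pos, wait, wait / FPS))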

The Philips DVS 4800 StreamCutter currently provides seamless splicing, but in a slightly different way. The StreamCutter supports local insertion (remember the need to have control of at least one local source) and switching between a single high definition source and multiple standard definition programs for ATSC. For DVB, the StreamCutter handles up to five program streams.

While the StreamCutter does not rely on the existence of splicing points (markers for clean splices within the MPEG-2 bitstream), it can utilize them when they are present (and available) for its Seamless Splicing Point Mode, where the main and local streams must contain compatible and aligned splicing points.

The Seamless Uniform Color Pictures (UCP) mode is visually similar to NDS' non-seamless splice, except that the StreamCutter itself sends a uniform color (or black) for the viewer to see, rather than leaving the viewer's decoder to generate a frozen picture or black. The UCP period is fixed and is less than half a second, but there is a slight delay as the system waits for the end of the GoP before sending the color.

One of the best features of the StreamCutter is that it can simultaneously switch four bitstreams each containing a single video component, or a single bitstream containing up to four video components. With StreamCutter, if you're already multiplexed for four channels, you can replace any one of those channels without having to demux.

StreamCutter supports DVB-compliant streams up to at least 80 Mbps as well as 19.4 Mbps ATSC streams. All interfaces are DVB-ASI.

Hewlett-Packard uses a buffering technique within its MPEG-2 MediaStream family of products for MPEG switching. HP's "CleanCut" technology provides seamless cuts-only capability between bitstreams coming off the server as video. This may seem fairly simple, but it is actually very complex due to timing issues. What happens is that the server 'pre-charges' the 'cut-to' decoder with the complete GoP so that a splice can be made on any frame within the video domain--sort of like an MPEG-2 frame store and synchronizer.
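A toy model of the pre-charge idea follows, with a stand-in decoder class that simply records which frames it has been given; nothing here reflects HP's actual implementation. The point is that feeding the complete GoP ahead of the cut guarantees any frame, not just the I frame, is reconstructable when the switch happens.

class ToyDecoder:
    def __init__(self):
        self.decoded = []          # frames reconstructed so far

    def feed_gop(self, gop):
        # A real decoder needs the reference frames for P and B frames;
        # feeding the complete GoP guarantees those references exist.
        self.decoded.extend(gop)

def precharged_cut(gop, cut_frame):
    decoder = ToyDecoder()
    decoder.feed_gop(gop)          # 'pre-charge' the cut-to decoder
    # The splice can now land on any frame, not just the I frame.
    return decoder.decoded[cut_frame]

gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print("cut lands on:", precharged_cut(gop, 4))  # a mid-GoP frame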

According to Al Kovalick, HP's principal architect for video servers, plans are to have a server-based system sometime in 1999 that will eventually be capable of logo insertions, cuts, fades, SMPTE wipes, and any linear effect. Because it is based in a server, the system will have the time to do the manipulations. According to Kovalick, how you approach manipulating MPEG-2 video should depend on whether you are working in realtime or non-realtime.

"In the realtime world, it's too early to say which is the right method [baseband or MPEG-2]," says Kovalick, "but in the non-realtime world, baseband is wrong. If you have pre-knowledge you can do things differently." But Kovalick warns, there will always be the question of price versus performance.

Not everyone agrees with Kovalick that baseband is wrong for non-realtime work. C-Cube Microsystems has a lot to gain from folks wanting to decode and then re-encode, since they make the chips that do the encoding and decoding. Their new DVxpress-MX chip may be the world's first all-digital, mixed-format compression chip, handling MPEG-2 (4:2:2 and 4:2:0) and DV25, with another model adding DV50 (4:2:2, 4:2:0 and 4:1:1). But in order to do the conversion, the single chip converts the compressed video to digital baseband, with a transcoding latency of seven to eight frames. Dr. Feng-Ming Wang, C-Cube's general manager for the PC/Codec division, believes that "the MPEG domain is always limited" and that baseband provides the "least possible problems." Dr. Wang says that the biggest problem is maintaining a compliant MPEG bitstream after the signal is modified without being decoded. "Baseband," he says, "is easy."

Baseband may be easy, but most experts agree that it is not the most efficient way of dealing with the need to manipulate video encoded as MPEG-2.

Other solutions, like those under development by Lucent Technologies and Sony, involve not just locating the I frame for splicing, but forcing the creation of an I frame through transcoding so that a splice can take place at any time.
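Schematically, a forced-I-frame splice looks like the sketch below. The frame letters and the splice_with_forced_i function are illustrative inventions, not either vendor's design; a real transcoder must also repair buffer levels, timestamps and any B frames that still reference pre-splice pictures.

def splice_with_forced_i(old_frames, new_frames, splice_index):
    """Cut from old_frames to new_frames at an arbitrary point by
    transcoding the first post-splice frame into an I frame."""
    head = old_frames[:splice_index]
    # The first frame of the new material is re-encoded with no
    # prediction so it stands alone (B frames that still reference
    # pre-splice pictures would also need repair; glossed over here).
    tail = ["I"] + new_frames[1:]
    return head + tail

print("".join(splice_with_forced_i(list("IBBPBBPBB"), list("PBBIBBPBB"), 4)))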

Editing

How do you edit on a frame that doesn't exist? For the FAST 601 (six-o-one) MPEG-2 nonlinear editor the answer is easy... all frames exist. FAST's 601 encodes everything as I frames, so each frame can stand on its own. If you want to do a cut, you can do it in the MPEG-2 realm and not lose quality by decompressing and recompressing. Of course, if you want to do anything else (dissolve, key, wipe, etc.), the 601 will still decode to baseband, manipulate the video, and re-encode the result to MPEG-2.
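The decision rule described above reduces to a few lines. This is a schematic of the workflow, not FAST's code; the edit names are the article's own examples.

COMPRESSED_DOMAIN_EDITS = {"cut"}  # safe on I-frame-only material

def perform_edit(edit_type, clip):
    if edit_type in COMPRESSED_DOMAIN_EDITS:
        return "%s on %s: done in MPEG-2, no generation loss" % (edit_type, clip)
    # Dissolves, keys and wipes need pixel data, so they round-trip.
    return ("%s on %s: decode to baseband, apply effect, re-encode"
            % (edit_type, clip))

for e in ("cut", "dissolve", "key"):
    print(perform_edit(e, "clipA"))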

Sony's Betacam SX format works very differently. BetaSX uses I and B frames for efficient compression during recording, but for editing, Sony decodes back to baseband video. Since the frames that a B frame uses for bi-directional prediction are already recorded on the BetaSX tape, a memory buffer lets the B frame see ahead before conversion to baseband.

Sony's hybrid DNW-A100 BetaSX DVTR does MPEG cuts-only editing without going to baseband, since the file is read off the disk and manipulated by a playout list.

Keying (and Effects)

On-screen logos are everywhere--stations and networks branding themselves in the eyes of the viewer. But what if the signal coming into your plant is already compressed for transmission, like the recent Harris ATSC pre-compressed feed of the shuttle Discovery launch carrying John Glenn? You might think that you would have to decode all of the picture. While that may have been the popular theory, technology is advancing to the point where you can do your keying without having to decode at all.

Today, if you want to do a key on compressed video, you no longer have to decode the entire frame. Pro-Bel offers a logo inserter that enables logos and IDs stored as bitmapped files to be inserted directly into an MPEG-2 bitstream. Using a new transport stream manipulation technology developed by the Institut für Rundfunktechnik (IRT), the research and development body of the public broadcasting authorities of Germany, Austria and Switzerland, the bitstream is decoded only in the area required for graphic insertion, at the macroblock level, so that the remainder of the picture passes through virtually unaltered. Both transparent and opaque graphics can be faded in and out at pre-set rates at user-defined screen locations.
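The saving comes from touching only the macroblocks that sit under the graphic. Here is a minimal sketch of that bookkeeping, assuming 16x16 macroblocks and a simple rectangular logo region; the IRT technology itself is not reproduced here.

MB = 16  # macroblock size in pixels

def macroblocks_for_region(x, y, w, h):
    """Row/column indices of every macroblock a w-by-h rectangle at
    (x, y) overlaps; only these need decoding and re-encoding."""
    cols = range(x // MB, (x + w - 1) // MB + 1)
    rows = range(y // MB, (y + h - 1) // MB + 1)
    return [(r, c) for r in rows for c in cols]

# A 96x32 logo at (600, 420) on a 720x480 picture:
touched = macroblocks_for_region(600, 420, 96, 32)
total = (720 // MB) * (480 // MB)
print("%d of %d macroblocks touched (%.1f%%)"
      % (len(touched), total, 100.0 * len(touched) / total))

On those numbers, a corner logo touches well under two percent of the picture's macroblocks.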

While not just for keying, Snell & Wilcox is using MOLE technology, developed by the ATLANTIC project (BBC (UK), Centro Studi e Laboratori Telecomunicazioni (Italy), Ecole Nationale Superieure des Telecommunications (France), Ecole Polytechnique Fédérale de Lausanne (Switzerland), Electrocraft (UK), Fraunhofer-Institut für Integrierte Schaltungen (Germany), Instituto de Engenharia de Sistemas e Computadores (Portugal), Snell & Wilcox (UK)), to keep concatenation errors to a minimum. MOLE decoders forward the previous encoding decisions at the macroblock level, invisibly, through the serial digital signal path, so that subsequent encoders can 'clone' the original encoding process except where the picture has changed (wipes, dissolves and keys).

Thomson has what it calls the "Helper Channel." Similar to MOLE technology, the Helper Channel places decoder-generated metadata in the vertical interval of the digital signal coming from a Helper Channel-equipped MPEG-2 decoder; a Helper Channel-equipped encoder then uses that metadata to re-encode the signal after manipulation. While slightly less efficient (and less expensive) than MOLE technology, the Helper Channel is another way around the problem of having to decode the entire frame--and it's available now, while MOLE is awaiting SMPTE standardization.
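Both schemes boil down to the same idea: carry each macroblock's original coding decisions alongside the pictures, and let the downstream encoder reuse them wherever the pixels were not touched. A hedged sketch of that logic follows; the metadata field names are invented for illustration.

def reencode(macroblocks, metadata, changed):
    """macroblocks: per-MB pixel data; metadata: each MB's original
    coding decisions; changed: indices altered by a wipe/dissolve/key."""
    out = []
    for i, mb in enumerate(macroblocks):
        if i in changed:
            out.append(("fresh-encode", i))  # must decide anew
        else:
            # Clone the original quantizer/motion decisions, avoiding
            # concatenation loss on untouched picture areas.
            out.append(("clone", i, metadata[i]))
    return out

meta = [{"qscale": 4, "mode": "inter"}] * 6
print(reencode(["mb"] * 6, meta, changed={2, 3}))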

Philips, however, is the technological leader in MPEG-2 keying and picture manipulation with their MPEG Object (Logo) Insertion system. At a demonstration at IBC, hidden in a section of their booth, Philips' Vincent Boutroux of the digital signal processing group in France showed what he could do with a business card.

First he would scan in the logo off a business card, then he would place that logo in the picture. The difference between other systems and the Philips system is that Vincent did not decode the background video to baseband... not one macroblock or pixel was decoded. The system works by using transcoding technology and DCT coefficients.
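The mathematical opening Philips exploits is that the DCT is linear: an alpha blend of two blocks in the pixel domain equals the same blend of their DCT coefficients. The short demonstration below (using an unnormalized 8x8 DCT-II and random blocks) shows only that property, not the Philips transcoder, which must also cope with quantization and motion-compensated blocks.

import numpy as np

def dct2(block):
    """Unnormalized 8x8 2-D DCT-II via its matrix form."""
    n = block.shape[0]
    k = np.arange(n)
    c = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    return c @ block @ c.T

bg = np.random.rand(8, 8)    # background block (pixels)
logo = np.random.rand(8, 8)  # logo block (pixels)
a = 0.8                      # roughly the 80 percent opacity mentioned below

pixel_then_dct = dct2(a * logo + (1 - a) * bg)
blend_of_dcts = a * dct2(logo) + (1 - a) * dct2(bg)
print("max difference:", abs(pixel_then_dct - blend_of_dcts).max())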

Images can be any size and shape, but they are currently displayed from transparent up to about 80 percent opacity, and their video levels follow the background video.

While there is some image flashing as the system learns the MPEG-2 sequence of the background video, the system does indeed work.

In fact, logos can be made to fly in and out if so desired, or a ticker can be placed on the screen. All without conversion to baseband.

One Other Small Problem

Due to the latency between the encoder at the broadcast facility and the decoder at the viewing point, tasks that we take for granted in analog will need new solutions. Two of the most prominent are time and cueing.

If you give out a time tone at the top of each hour, you can compensate for your station's encoder latency, but there is no way to compensate for each viewer's decoder latency (and each brand will have a different latency). To the viewer, your top-of-the-hour chime will be close, but not exact.
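To make the point concrete, here is the arithmetic with invented numbers: the station can cancel its own encoder delay by sending the tone early, but the decoder's share differs per receiver and cannot be pre-compensated.

ENCODER_DELAY = 1.2  # seconds; invented figure, but known to the station

def chime_offset(decoder_delay, compensate=True):
    """Seconds after the true top of the hour that the viewer hears the
    tone. The decoder's share cannot be pre-compensated because it
    differs from one receiver brand to the next."""
    sent_early = ENCODER_DELAY if compensate else 0.0
    return ENCODER_DELAY - sent_early + decoder_delay

for brand, delay in (("decoder A", 0.3), ("decoder B", 0.7)):
    print("%s: chime lands %.1f s late" % (brand, chime_offset(delay)))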

For news organizations that use off-air cueing for live remote reports, the latency at the reporter's TV set will be too large for the signal to be useful. For this task, private radio frequencies with IFBs are one solution (the frequencies will have to be coordinated, since everyone will want some) while reporters learn not to watch themselves on air.