
H.264, MPEG-4


Paul Bruening


  • Premium Member

The Wikipedia article on this is bending my brain. Could someone dumb it down for me? Is this a suitable format for originating HD video in 1080p? Which tape format that I'd already know is it most like? Strengths? Weaknesses?


h.264 is wavelet compression. Wavelet compression works under the assumption that any picture is basically [a half-res version of that picture] + [the details that make up the difference between the half-res and full-res version]. So a 1920x1080 image gets broken down into a [960x540 image] + [detail], and the half-res gets broken down into [a 480x270 image] + [detail], etc. This is done for each frame, and it's very processor-intensive to encode and decode, but the end result is that all you're storing is a very tiny image file, and all of the successive layers of detail.

All of this has several benefits:

1) you can get very low file sizes while still maintaining good image quality, which is why h.264 is a popular codec for internet delivery.

2) If you're clever, you can decode only partially and still view the image [albeit at lower quality]. This is what Cineform does, and RED does a somewhat modified version of it. But it's pretty cool because you can use your full-res file as its own proxy [at least with Cineform, h.264 can't do this], and do your offline edit with all of the same files. This works because it's not adding every level of detail back on, so you're basically looking at a half- or quarter-res version of the original file, and it's a light enough decode that you can do it in realtime.

3) Because a lot of the detail that's being preserved at the top level is noise or grain, if you start throwing out some of that detail data, you actually have the effect of denoising the image.
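The "half-res plus detail" idea and the partial decode from point 2 can be sketched in a few lines. This is only an illustration on a single row of pixel values using plain Haar-style averaging; the function names are made up for the example, and real wavelet codecs such as Cineform or JPEG 2000 work in two dimensions with different filters and then quantise the detail. As noted later in the thread, H.264 itself does not work this way.

```python
def haar_split(signal):
    """Split an even-length row into a half-res average and the detail needed to undo it."""
    half = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return half, detail


def haar_merge(half, detail):
    """Rebuild the full-res row from a half-res row and its detail layer."""
    out = []
    for avg, diff in zip(half, detail):
        out.extend([avg + diff, avg - diff])
    return out


def encode(signal, levels):
    """Repeatedly halve the row; returns the tiny base row and the detail layers (finest first)."""
    details = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_split(current)
        details.append(detail)
    return current, details


def decode(base, details, levels_to_restore=None):
    """Add detail layers back, coarsest first. Restoring fewer layers gives a lower-res proxy."""
    if levels_to_restore is None:
        levels_to_restore = len(details)
    current = list(base)
    for detail in reversed(details[len(details) - levels_to_restore:]):
        current = haar_merge(current, detail)
    return current


row = [10, 12, 14, 16, 50, 52, 90, 94]   # one row of made-up pixel values
base, details = encode(row, levels=2)

full = decode(base, details)              # every layer restored: the original row
proxy = decode(base, details, 1)          # only the coarsest layer restored: a half-res row
print(base, proxy, full)
```

Stopping the decode early is what makes the full-res file usable as its own proxy: the half-res row falls out of the same data without touching the finest detail layer.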

 

But unlike RED and Cineform, h.264 is a delivery codec. It is lossy and was never intended as an acquisition format. I suppose that at very low compression levels it might be workable, but it's basically a really fancy version of the miniDV or HDV codecs: meant to fit a good-quality image into a very small file size, but not what you want to be using when you really care about your image.

 

Is that helpful?


  • Premium Member

Scott, I think you might be confusing h.264 with JPEG-2000. The latter is a wavelet compressor, the former isn't. h.264 uses some DCT-like techniques, like a normal JPEG or DV frame, to compress raw bitmap image data, among a very wide selection of other techniques.

 

Both Cineform and Red are extremely lossy (Red is so compressed I'd barely call it 2K, let alone 4K, regardless of what the camera is capable of). h.264 is capable of working in an I-frame-only mode, as in AVC-Intra. Whether these techniques are comparable or not depends on whether you consider DCT and wavelet comparable. There is a school of thought which takes into account the fact that DCT produces blocking in predictable locations, which can be intelligently removed, whereas wavelets produce softness, so wavelet output looks better without any postprocessing. However, wavelet also produces very severe aliasing (at those half-resolution steps you were talking about) which cannot be predicted or intelligently removed.

 

Wavelet is not a panacea.

 

P


Scott, I think you might be confusing h.264 with JPEG-2000

 

H.264 does not allow any direct use of wavelets, IIRC. However, H.264 is a particular flavor of MPEG-4 (Part 10, to be precise), and MPEG-4 in general allows wavelet-type transforms in addition to DCT-type ones, if my memory serves me right.

 

Both cineform and red are extremely lossy (red is so compressed I'd barely call it 2K let alone 4, regardless of what the camera is capable of).

 

Both Cineform and Red are actually mildly compressed. I think both are on the order of 10:1, which is mild, and that is one reason their output looks good: they are not heavily compressed in the first place. Extreme compression would be on the order of 500:1, which is done for many realtime applications.

 

h.264 is capable of working in an I-frame only mode, as in AVC Intra. Whether these techniques are comparable or not depends on whether you consider DCT and wavelet comparable.

 

True. However, unlike JPEG and MPEG-2, IIRC, H.264 allows prediction within intra frames as well, so in addition to predicting macroblocks from neighboring frames, you can predict them from other macroblocks in the same frame.
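To make the same-frame prediction concrete, here is a minimal sketch of one such scheme: a 4x4 block is predicted by copying the row of pixels directly above it downwards, and only the difference (the residual) is kept. Vertical prediction is one of the intra modes H.264 defines, but everything else here is simplified and the pixel values and function names are invented for the example; a real encoder picks among several modes per block and then transforms and quantises the residual.

```python
def predict_vertical(top_row, size):
    """Predict a size x size block by repeating the already-decoded row above it."""
    return [list(top_row) for _ in range(size)]


def residual(block, prediction):
    """What is left to encode once the prediction has been subtracted."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


def reconstruct(prediction, res):
    """Decoder side: rebuild the block from the same prediction plus the residual."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, res)]


# Made-up pixel values: the bottom row of the block above, and the block to encode.
top_row = [100, 102, 104, 106]
block = [
    [101, 102, 104, 107],
    [100, 103, 104, 106],
    [100, 102, 105, 106],
    [99, 102, 104, 105],
]

prediction = predict_vertical(top_row, 4)
res = residual(block, prediction)
print(res)  # small values near zero are much cheaper to store than the raw pixels
```

The residual only stays small when the block really does resemble its neighbours, which is why an encoder tries several prediction directions and keeps the best one.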

 

Wavelet is not a panacea.

 

Of course not. The best compression for a single image using linear methods would result from using SVD (singular value decomposition) and, after that, KLT (the Karhunen-Loève transform). As you mentioned, an advantage of wavelets is that there are no blocking artifacts.

 

Additionally, there is another flavor of wavelets, called wavelet packets, that is a little different from the traditional wavelet configuration and is sometimes also used in compression.

Edited by DJ Joofa

Just thought I might ask, while we are on the H.264 topic.

Is it only me, or does anyone else also see a distinct difference in contrast/gamma after H.264 compression in FCP?

It seems to me that the image becomes kind of "milkier" and less contrasty.

Is H.264 supposed to be the best codec for internet streaming (vimeo uploading, for example)?


Is H.264 supposed to be the best codec for internet streaming (vimeo uploading, for example)?

 

In many ways, yes. H.264 was designed for low-bitrate streaming applications. According to several studies, H.264 requires roughly half the bitrate of MPEG-2 for the same quality. Though an H.264 encoder can get very complex, the good news is that, because of the asymmetric encoding/decoding process inherent in MPEG-type schemes, the decoder (for a particular profile/level) is relatively less complex and should be fast.


  • Premium Member

Hey DJ,

 

Thanks for your answers. I have another: Youtube puts a 20GB cap on uploads. Can an H.264, 720P, 30fps, 2hr runtime feature fit into that 20GB? If not, how many minutes will fit using the aforementioned numbers?



Hi Paul,

 

20 GBytes = 163840 Mbits, and 2 hours = 7200 seconds. Hence the maximum bitrate is 163840/7200 = 22.76 Mbits/second, which is a reasonably good bitrate for 1280x720@30fps.

 

For comparison:

 

** Red camera compresses to 1.2 bits/sample.

** DVD at its best quality for video only (~8Mbits/sec) is about 0.51 bits/sample on the 4:2:0 MPEG-2 data.

** 1280x720@30 fps comes out to be 0.55 bits/sample on 4:2:0 data.

 

Going by data rate alone, you should be fine compressing a 2-hour movie in H.264 into 20 GBytes. However, in practice one would have to figure out which H.264 profile/level the above data rate fits into, and drop to a lower level (or perhaps profile) if needed.
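For anyone who wants to plug in their own cap or runtime, the arithmetic above can be sketched as follows. The function names are made up for the example. Two assumptions are needed to land on the figures quoted: the cap is converted with 1 GByte = 1024 x 8 Mbits, while the bits/sample step treats a megabit as 10^6 bits, and the DVD line assumes a 720x480 frame at 30 fps.

```python
def max_bitrate_mbits(cap_gbytes, runtime_seconds):
    """Highest average bitrate (Mbits/second) that fits the whole runtime under the cap."""
    cap_mbits = cap_gbytes * 1024 * 8      # assumption: 1 GByte = 1024 MBytes = 8192 Mbits
    return cap_mbits / runtime_seconds


def bits_per_sample(bitrate_mbits, width, height, fps, samples_per_pixel=1.5):
    """Average bits spent per sample; 4:2:0 video carries 1.5 samples per pixel."""
    samples_per_second = width * height * samples_per_pixel * fps
    return bitrate_mbits * 1_000_000 / samples_per_second   # assumption: 1 Mbit = 10**6 bits


bitrate = max_bitrate_mbits(20, 2 * 60 * 60)
hd = bits_per_sample(bitrate, 1280, 720, 30)
dvd = bits_per_sample(8, 720, 480, 30)     # assumption: 720x480 DVD frame at 30 fps

print(f"max bitrate: {bitrate:.2f} Mbits/second")
print(f"720p30 at that rate: {hd:.2f} bits/sample")
print(f"DVD at 8 Mbits/second: {dvd:.2f} bits/sample")
```

This only says what the cap allows on average; it says nothing about which profile/level an encoder would need to hit that rate.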


Can anyone explain to me why H264 is so terrible as an editing codec in an NLE?

 

It's something to do with each frame not really being a true progressive frame? Or not being "frame accurate" or something like that?


  • Premium Member

H.264, along with many (even most) other codecs, is capable of exploiting the fact that most frames have a lot of similarity with those around them. In these cases, a block of a frame may be encoded as a reference to a block in a previous or upcoming frame, along with a vector describing where that block should be moved to in order to approximate the new frame.

 

This implies that to reconstruct any given frame, it may be necessary to return to the nearest complete frames before and after the target, in order to recover image data which may be part of the new frame. Doing so isn't necessarily a problem for editing; most modern NLEs will deal with it automatically, but it does impose a processing load which may reduce the interactive performance of the software.
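A rough way to see the processing load is to count how many frames have to be decoded before one particular frame can be shown. The sketch below assumes a deliberately simplified stream made only of I-frames (complete pictures) and P-frames (each depending on the frame before it); the function name and the 15-frame group are invented for the example. Real H.264 streams may also use frames that reference later pictures, as described above, which makes the chain longer still.

```python
def frames_to_decode(frame_types, target):
    """Count the frames that must be decoded to display frame `target`.

    frame_types is a string such as "IPPP": "I" is a complete frame and
    "P" depends on the frame before it (a simplifying assumption).
    """
    start = frame_types.rfind("I", 0, target + 1)   # nearest complete frame at or before target
    if start == -1:
        raise ValueError("no complete frame before the target")
    return target - start + 1


long_gop = "I" + "P" * 14      # one complete frame followed by 14 dependent frames
intra_only = "I" * 15          # every frame complete, as in an I-frame-only mode

print(frames_to_decode(long_gop, 14))    # landing on the last frame of the group
print(frames_to_decode(intra_only, 14))  # same position in an all-I stream
```

Scrubbing or cutting on an arbitrary frame in the long-GOP case means redoing that chain each time, which is where the sluggishness in an NLE tends to come from.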

 

As to the much earlier comment that Red is "lightly compressed" - christ, the creative accounting that's going on here is ridiculous. 10:1 is far from mild. 3:1 MJPEG was considered harsh for many broadcast applications, and while that to some extent is a sign of changing times, it lavishly illustrates the fact that 10:1 is heavy. DV is about 5:1, and not considered good enough for high end work. Even if you believe that wavelets are twice as good as the better-than-JPEG DCT used in DV, which they aren't, Red really is the DVCAM of high definition. And that's being nice to it.

 

If you want to be somewhat less nice to it - and let's face it, we do - take into account the fact that the image is already intrinsically compressed by the subsampling inherent to the Bayer array. You're only compressing one plane's worth of data out of the three you will eventually reconstruct, so it's hard not to arrive at a figure of 30:1 including both subsampling and wavelets. HDCAM was harshly criticised for its heavy subsampling and resolution reduction, but I suppose Graeme Nattress has decided that it's now OK, so that's all right then...

 

You can try - and fail, I think - to overlook Bayer loss, and claim it's 4K. You could also try to overlook Bayer loss and call it 10:1 compression. Even if you accept either of these, you certainly can't have it both ways.

 

P
