Death-By-Bitrate
Proudly
Presents
Note: the bouncing ball graphics on the page were swiped without permission from Chris Pirazzi's page. He presents the same basic information but with a different focus. Still, it's worth a read if you're at all interested in this stuff.
This is probably going to sound a bit like a rant so let me just preface this with a bit of information. There is a lot of confusion about the need to deinterlace SVCD video clips. I’m sick of getting asked about it. Virtually every guide that I read tells you to deinterlace your source material before encoding. This is just plain wrong. It also shows that a lot of people really don’t understand the MPEG-2 format or the NTSC playback methods. That being said…
Understanding FILM
Okay, let’s start with the easy one. As most people know, movies that are shown in theatres are projected from big reels of film. Each second of film on these reels is made up of 24 complete pictures. Pretty simple, right? Okay, let’s get a bit more advanced. Each one of these pictures is called a frame. In this case, they’re also 24 distinct fields. What’s a field? Read on…
Understanding NTSC
NTSC is a method of displaying pictures at 60 Hz; that is, 60 pictures every second. These pictures are called fields. Let’s say you shoot a picture of a ball bouncing by. Here are 10 frames from this sequence. The problem is, if you record this with standard video equipment you won’t end up with this.
Instead, you end up with something like this. Each frame only gets every other line. Frame 0 has all the even lines, and frame 1 has all the odd lines. This pattern continues forever until the end of the clip.
"So why can’t I just put them back together? Shouldn’t I end up with a nice solid sequence like the first example?"
No. If you put them "back together" you get something like this:
See the problem? Half of the frames are missing. There’s no nice way to reconstruct them.
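If the pictures aren’t in front of you, here’s a rough sketch of the same idea in Python. It’s a toy model and not anything a real capture card does: each scan line is just a label saying when it was captured, and the names capture_field and weave are made up for this example.

```python
# Toy model (an assumption for illustration): a field holds every other
# scan line, and each line is labeled with the moment it was captured.

def capture_field(time_index, height=6):
    """Return the lines of one field as {line_number: label}."""
    parity = time_index % 2  # even moments get even lines, odd moments get odd lines
    return {line: f"t{time_index}" for line in range(parity, height, 2)}

def weave(field_a, field_b, height=6):
    """Interleave two fields of opposite parity into one full-height frame."""
    lines = {**field_a, **field_b}
    return [lines[n] for n in range(height)]

frame = weave(capture_field(0), capture_field(1))
print(frame)  # ['t0', 't1', 't0', 't1', 't0', 't1']
```

Every other line in the woven frame comes from a different moment in time. Anything that moved between t0 and t1 shows up as a comb pattern.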
"I know! I’ll deinterlace them!"
*SMACK* Don’t be silly. Deinterlacing will leave you with one of two results. If you just double all the lines, then your file will look all jagged. Like this:
"Fine. I’ll use a blending filter!"
Great. So now you’ve got a bunch of blurry frames:
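Here’s a minimal sketch of both approaches, again on a toy frame. The assumption is that each scan line is reduced to a single number: the horizontal position of the ball’s edge on that line. The function names are hypothetical, and real filters work on pixels and are a lot smarter than this.

```python
def line_double(frame, keep_parity=0):
    """Throw away one field and repeat each remaining line twice."""
    out = []
    for n in range(keep_parity, len(frame), 2):
        out.extend([frame[n], frame[n]])
    return out[:len(frame)]

def blend(frame):
    """Average each line with its partner line from the other field.

    Assumes an even number of lines.
    """
    out = []
    for n in range(len(frame)):
        partner = n + 1 if n % 2 == 0 else n - 1
        out.append((frame[n] + frame[partner]) / 2)
    return out

# Even lines saw the ball's edge at x=10, odd lines (captured later) at x=20.
woven = [10, 20, 10, 20, 10, 20]

print(line_double(woven))  # [10, 10, 10, 10, 10, 10]
print(blend(woven))        # [15.0, 15.0, 15.0, 15.0, 15.0, 15.0]
```

Line doubling keeps only half of the vertical detail and discards one of the two moments entirely. Blending puts the edge at a position where the ball never actually was, which is the smear you see in the blurry frames.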
Okay, enough of that. Let’s get back to understanding how this fits into NTSC MPEG-2 encoding. Because NTSC alternates between odd and even lines during playback, it makes sense to combine these two fields together to create a single frame out of them. So while we’re still displaying 60 fields per second (as required by NTSC) we’re actually displaying them from 30 frames per second. For some murky technical reasons, this ends up being encoded as 29.97 frames per second in MPEG-2.
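For the curious, those odd-looking numbers have exact forms. The NTSC frame rate is 30000/1001 frames per second, and everything else follows from it. A quick check in Python (the constant names are mine):

```python
from fractions import Fraction

NTSC_FRAME_RATE = Fraction(30000, 1001)           # the "29.97" figure
NTSC_FIELD_RATE = 2 * NTSC_FRAME_RATE             # two fields per frame
FILM_ON_NTSC = NTSC_FRAME_RATE * Fraction(4, 5)   # 4 film frames per 5 video frames

print(round(float(NTSC_FRAME_RATE), 2))  # 29.97
print(round(float(NTSC_FIELD_RATE), 2))  # 59.94
print(round(float(FILM_ON_NTSC), 3))     # 23.976
```

So the "60 fields" and "30 frames" used throughout this article are really 59.94 and 29.97, and film carried on NTSC runs at 23.976 rather than a flat 24.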
Still with me? Okay, then let’s get on to the fun part:
Telecine or 3:2 Pulldown
In order to watch 24fps movies on our 30fps NTSC TVs a conversion must be performed. This method is called Telecining (televised cinema) or 3:2 pulldown. The process is fairly straightforward, but can be a bit weird to explain. Let’s start out with the math:
24fps / 6 = a sequence of 4 frames.
We need to get to 30 fps, so…
30fps / 6 = a sequence of 5 frames.
Okay, so to go from 24fps to 30fps, we need to come up with an extra frame every 4 frames. There are a number of possibilities here, but the movie industry settled on the Telecine standard. Here’s what happens. Certain fields are repeated so that we get an extra frame, turning each 4 frame sequence into a 5 frame sequence. There are a number of nasty methods of doing this, but it is most simply explained the following way:
You’ve got these 4 frames from the original movie:
A B C D
Now we split the odd and even lines apart so that we can make 2 fields out of each frame…but there’s a twist. Frames B and D are going to get repeated one more time than the others.
A A B B B C C D D D
Now we take each of the resulting 2 fields and combine them into single frames.
AA BB BC CD DD
So what we’re left with are 5 frames. 3 of which (AA, BB & DD) are exactly the same as the original source, but 2 (BC & CD) are mangled versions of the master "C" frame. This works out fine on our TVs but can make dealing with certain video sources rather interesting. This is just one way of handling pulldown. Although the basic concept is the same, many companies will vary the order that they do it in. Also, it’s not uncommon to find that some movies will have HALF the field in one frame and half in another…
Do that 6 times for every second of film, and you’ve got your 30 frames-per-second-NTSC-happy video clip. If you didn’t understand the above then read it again. Keep on reading it until it sinks in. It took me a few times to get it, too.
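If it helps, here is the same A B C D walk-through as a few lines of Python. This is only a sketch of the one pattern described above, with each field represented by the letter of the film frame it came from. The function name pulldown_32 is made up, and real telecine hardware may order things differently.

```python
def pulldown_32(film_frames):
    """Telecine film frames using the A A B B B C C D D D field pattern."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeats = 2 if i % 2 == 0 else 3  # A and C give 2 fields, B and D give 3
        fields.extend([frame] * repeats)
    # Pair up consecutive fields to build the video frames.
    return [fields[i] + fields[i + 1] for i in range(0, len(fields) - 1, 2)]

print(pulldown_32("ABCD"))           # ['AA', 'BB', 'BC', 'CD', 'DD']
print(len(pulldown_32("ABCD" * 6)))  # 30
```

Four film frames go in and five video frames come out, so one second of film (24 frames, or six groups of four) becomes 30 video frames.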
Understanding MPEG-2 (DVD) encoding methods
There are many different kinds of video sources that you can encounter when encoding from DVD. Most of them need to be treated differently to get the best possible results. Let’s go over each of them.
Standard NTSC broadcast television
Examples of this would be pretty much any live action TV series on the networks in North America. We’re talking about 29.97fps interlaced video with a 4:3 (your TV screen) aspect ratio. If you load one of these in FlasK, you’ll see that FlasK reports a fps of 29.97 and an interlaced format.
Anime (or CG) NTSC broadcast television
Ah, the good stuff. Many people will tell you that 3:2 pulldown is only found in movies that have been converted to television. This just ain’t true, folks. Since most anime is hand drawn, studios tend to try and keep the number of frames per second down to a minimum. So even though this stuff is destined for the 30fps NTSC TV screen, it’s all drawn at 24fps. Since anime starts its life as individual cels, much like the frames on a movie reel, it’s transferred to TV using the exact same telecine method as films. This will show up in FlasK as one of the two following types of encodes:
Hard Telecined Encode
FlasK will report 29.97 fps and will stay steady at "Interlaced" during playback. You’ll see this on series that have been shot from a telecined video source. The telecining was done ahead of time on the original analog video masters before the digitizing process began. These come in both 16:9 and 4:3 aspect ratios.
Soft Telecined Encode
FlasK will report 29.97 fps but detect a fps of 23.976. During playback the encoding method will alternate between "Progressive" and "Interlaced." In this case, the original material was mastered from 24fps progressive material. During the encode, some flags were set in the MPEG-2 encoder to make it store 23.976 fps internally, but to actually play back in telecined format. These come in both 16:9 and 4:3 aspect ratios.
Pure Film Encode
This case is similar to the above, except that FlasK will stay rock-solid steady on "Progressive." There aren’t a lot of these kinds of movies out there, but it’s nice when you find one. >:> These movies are encoded and stored in 23.976 fps format. They have a flag set to invoke playback at 29.97fps, but they rely on the DVD player software to perform the required telecining on-the-fly. Essentially, the MPEG-2 just contains the original 24 frames. These come in both 16:9 and 4:3 aspect ratios.
"Okay, I’ve read all that. Now how do I de-interlace my source file?"
Aaaaah!!! You don’t! You really don’t need to deinterlace your source files if your target file has the same number of horizontal lines. Period. End of story. Look at the same "deinterlaced" images once again. They look worse, not better. But there is something that you can do to get better quality out of a lot of these streams. It’s not deinterlacing, mind you. It’s called…
Inverse Telecine or (IVTC)
Okay, we know what Telecine/3:2 pulldown is, now. We know that we can just encode our target file as an interlaced stream and everything will turn out okay. But we can get a lot nicer quality if we can undo the pulldown first and then re-do it in our own encode. Let me explain a bit more.
It all comes down to bitrate. Bitrate is the standard unit-of-measure for setting quality in an encoded MPEG. For SVCD we have a max bitrate of 2600kbits, but for most encodes, our average bitrate will be somewhere around 1500-1700kbits. What most people don’t realize is that bitrate is applied per second not per frame. This means that a 23.976fps 1 second clip encoded at 1700kbits will have 6 fewer frames to allocate bits to than a 29.97fps 1 second clip encoded at the same bitrate. More simply:
23.976fps @ 1700kbits = 70904 bits per frame, while
29.97fps @ 1700kbits = 56723 bits per frame.
To put this in perspective, let’s say that you encode a movie at a bitrate of 1700kbits @ 23.976fps. If you encode the exact same movie at 29.97fps, you’d have to use a bitrate of 2125kbits in order to get the same kind of quality as the 23.976fps encode. That’s an extra 425000 bits every second! Think of how much more quality you can pack into your 23.976 encode!
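You can check these numbers yourself. The sketch below assumes 1 kbit = 1000 bits and spreads the bits evenly over the frames. A real encoder does no such thing (I, P and B frames get very different amounts), so treat the result as an average.

```python
def bits_per_frame(bitrate_kbits, fps):
    """Average number of bits available to each frame."""
    return bitrate_kbits * 1000 / fps

film = bits_per_frame(1700, 23.976)
video = bits_per_frame(1700, 29.97)
print(int(film), int(video))  # 70904 56723

# Bitrate a 29.97fps encode needs to match the per-frame budget of
# a 1700kbits encode at 23.976fps.
equivalent = 1700 * 29.97 / 23.976
print(round(equivalent))      # 2125
```

The ratio between the two frame rates is exactly 5:4, which is where the 25% bitrate penalty comes from.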
"Okay, but I thought you said that I can only play back my SVCD at 29.97fps?"
Right. But you can still encode your SVCD at 23.976fps. All you need to do is set the 3:2 pulldown flag in your encoder. This way, your video is still stored at 23.976fps goodness, but it has special flags embedded into it to force playback at 29.97fps.
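Here is a rough sketch of what a player does with those flags. MPEG-2 carries two per-picture flags for this, top_field_first and repeat_first_field. Those names come from the MPEG-2 video specification, not from any particular encoder’s settings dialog. The function name and the specific flag values below are my own assumption, picked to reproduce the A A B B B C C D D D pattern from earlier.

```python
def expand_soft_telecine(pictures):
    """Turn stored progressive pictures into the fields a player outputs.

    pictures: list of (name, top_field_first, repeat_first_field) tuples.
    """
    fields = []
    for name, top_field_first, repeat_first_field in pictures:
        first, second = ("top", "bottom") if top_field_first else ("bottom", "top")
        fields.append((name, first))
        fields.append((name, second))
        if repeat_first_field:
            fields.append((name, first))  # the "extra" third field
    return fields

# Four stored film frames with an assumed flag pattern.
coded = [
    ("A", True, False),
    ("B", True, True),
    ("C", False, False),
    ("D", False, True),
]

for name, parity in expand_soft_telecine(coded):
    print(name, parity)
```

Four stored pictures come out as ten fields with strictly alternating top/bottom parity, which is five NTSC frames. The encoder only ever had to spend bits on four.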
Another advantage of converting back to progressive frames is that it makes resizing much easier. Even if you’re still encoding at 480 lines, you’ll need to resize if you’re converting a 16:9 anamorphic encode to the 4:3 aspect ratio required by SVCD players. If you try to do this without undoing the pulldown, you’ll end up with a very badly-interlaced mess.
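To get a feel for how much resizing is involved, here is the arithmetic as a small sketch. It assumes the usual letterbox approach (squash the picture vertically and pad with black), which is my assumption and not something spelled out above. The function name is hypothetical.

```python
from fractions import Fraction

def letterbox(total_lines=480, source_ar=Fraction(16, 9), target_ar=Fraction(4, 3)):
    """Return (active picture lines, black border lines per side)."""
    active = int(total_lines * target_ar / source_ar)
    border = (total_lines - active) // 2
    return active, border

print(letterbox())  # (360, 60)
```

Squeezing 480 lines down to 360 means every output line is a mix of neighbouring source lines. On an interlaced frame those neighbours belong to different fields, which is why the result turns to mush.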
"Great! I’m thoroughly convinced! I want to do an IVTC on all my encodes from now on…"
Whoa. It ain’t so easy. If it was, everyone would be doing it by now. The problem is, undoing 3:2 pulldown can often be an adventure in futility. Lots of programs have IVTC functions (Adobe After Effects, VirtualDub, AVISynth, TMPGEnc, etc.) but the problem is that none of them work really well. It’s less a problem with their built-in algorithms than with the myriad of telecine schemes out there. There are a lot of different ways of applying 3:2 pulldown patterns. Most of the time, you end up with something that’s almost right, but not quite. What’s really needed is a program that lets you take some control over the process…
And then TMPGEnc v12 came out. >:> The IVTC function in this program is worth its weight in gold. It’s worked for virtually every piece of anime and most movies that I’ve thrown at it. In fact, I have only found one clip so far that I couldn’t undo with it. Having said that, the IVTC process still leaves a bit to be desired. It’s very slow and it requires you to convert your source file into an AVI. So you’ll need many extra gigs floating around on your HD if you want to do this. Still, for some of us, the quality is worth it. We’ll talk about the details of this later. For now, let’s go over the above source formats again and see what should be done to get a nice SVCD encode from each of them.
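To make the idea concrete, here is a toy inverse telecine. It is emphatically not how TMPGEnc (or any real tool) does it. It cheats by assuming every field is already labeled with the film frame it came from, and that neighbouring film frames never share a label. Real programs have to guess that by comparing pixels, which is exactly where things go wrong.

```python
def inverse_telecine(video_frames):
    """Recover the film frames from telecined frames like 'AA', 'BC'.

    Each video frame is a two-character string, one label per field.
    """
    fields = [label for frame in video_frames for label in frame]
    film = []
    for label in fields:
        if not film or film[-1] != label:  # skip fields we've already seen
            film.append(label)
    return film

print(inverse_telecine(["AA", "BB", "BC", "CD", "DD"]))  # ['A', 'B', 'C', 'D']
print(inverse_telecine(["AA", "AB", "BC", "CC", "DD"]))  # ['A', 'B', 'C', 'D']
```

Both cadences collapse back to the same four film frames. The second call uses a different pulldown order to show why a tool that only knows one fixed pattern will trip up on some sources.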
Standard NTSC TV
Nothing! Just encode this file as an interlaced 4:3 SVCD stream. You’ll end up with an MPEG-2 that’s almost as good as the original.
Anime NTSC TV
In most cases, you can just encode as a standard interlaced 4:3 SVCD stream. If you want to spend more time on it for more quality, you can probably manage to perform an IVTC on the file. If you have a very fast PC and lots of extra HD space, this is probably worth doing.
16:9 Anamorphic movies
Okay, folks. IVTC isn’t an option here, it’s a requirement. In order to get these suckers to play nicely in the 4:3 aspect ratio that SVCD requires, we’re gonna have to resize. If we don’t, we’ll end up with a half-blended interlaced mess that would look worse than a bad VCD encode. That means that we either (shudder) de-interlace the file first, or we perform an IVTC on it so that we can resize the progressive pictures properly. If FlasK reports that the file is totally progressive, then you can just set the framerate to 23.976 in FlasK and check "reconstruct progressive frames" and FlasK will do the IVTC for you. This will save a hell of a lot of time. If FlasK reports both progressive and interlaced frames then you’ll have to stream out of FlasK at 29.97fps and do the IVTC in TMPGEnc. This will take a while. Fortunately, it will take a lot less time than your 2-pass encode is likely to…
Sometimes you’ll end up with a 16:9 that cannot be IVTC’d properly. In this case, you’re pretty much outta luck. You can’t really deinterlace these streams. If you’re a real trooper, you could probably tweak it out with AVISynth’s PeculiarBlend() option. You may even want to try FlasK’s deinterlacing option and see how it goes. So far, out of my 12 sample clips, I only have one that refuses to IVTC, so for the most part these should be the only clips that you should ever consider deinterlacing.
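Pulling the three cases together, the decision boils down to something like the sketch below. The function and its arguments are made up for illustration; nothing like it exists in FlasK or TMPGEnc. Sending every non-progressive 16:9 source to TMPGEnc is my reading of the advice above.

```python
def recommend(aspect, flask_report, telecined=False):
    """Suggest a workflow for one source clip.

    aspect:       "4:3" or "16:9"
    flask_report: "progressive", "interlaced" or "mixed"
    telecined:    True for film or anime material carried on NTSC
    """
    if aspect == "16:9":
        if flask_report == "progressive":
            return "set 23.976fps in FlasK and reconstruct progressive frames"
        return "output 29.97fps from FlasK, then IVTC in TMPGEnc"
    if telecined:
        return "encode interlaced 4:3, or IVTC first for better quality"
    return "encode interlaced 4:3"

print(recommend("4:3", "interlaced"))
print(recommend("4:3", "interlaced", telecined=True))
print(recommend("16:9", "progressive", telecined=True))
print(recommend("16:9", "mixed", telecined=True))
```

Deinterlacing doesn’t appear anywhere in that function. It only comes into play as a last resort when the IVTC step fails on a 16:9 source.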
Bottom line on deinterlacing and SVCD:
Don’t. From the above, it should be obvious that there’s really no point. It makes quality worse, not better. Most deinterlacing algorithms will make high motion and pans look jerky. If you want to get rid of interlacing artifacts and improve the quality, then invest the time and do a proper inverse telecine, then encode with the 3:2 pulldown flag set. If your deinterlaced encodes look better than your interlaced ones, then you’re not doing it right. Check that you’re encoding at 29.97fps and that you’ve got the field order correct. Deinterlacing is a good thing for VCD and DiVX ;-), but not for MPEG-2 SVCD or DVDs.
Special thanks to:
Everyone that keeps insisting that deinterlacing is a good idea. You gave me the motivation to write this.
A Death-By-Bitrate article.
- Inwards (August 19, 2000)