NTSC, PAL & Interlace Explained
The Motion Picture Camera &
Cinema
Motion picture cameras are based on photographic film
just like your everyday handheld photographic camera. Hollywood
movies use 35mm film, but professional cameramen often use 16mm and
the home enthusiast will usually be content with 8mm. To record a
movie, motion picture film is spun around a big reel inside a camera
and exposed 24 times a second. As a result it captures 24
photographs, or what we call 24 "frames", every second (fps). Each
frame is one complete photograph; it is not digitally stored or
compressed - you could almost literally cut each one out and stick
it in your family photo album if you wanted! Once the movie is made,
the film is developed, placed onto a projector, and projected onto
the cinema screen.
Resolution
In terms of resolution it's not really possible to
compare 35mm film directly to a VHS or VGA resolution because, like any
photographic film, its resolution is based on a myriad of tiny
light-sensitive crystals embedded in the film. When these are struck by
light they change colour to match the light that has hit them,
producing a photo. But based on average crystal size, a 35mm film
would be roughly equivalent to 5000 x 5000 pixels. This is also the resolution
Photoshop artists such as Craig Mullins use to create movie
backdrops for the cinema. Nevertheless, the human eye can barely see
the equivalent of 3000 x 3000 pixels over such a small area. So when a
35mm movie is scanned into a computer to try and get its full
resolution for digital editing, it will be scanned in at 4096
horizontal pixels, also known as 4K.
Television
Television, on the other hand, is a whole other ball
of wax! As you probably know, a TV screen is basically an empty glass
box (or tube) with all the air sucked out of it. The inside front of
this glass box is covered with a mesh of red, green and blue
phosphor dots. At the back of the tube are three devices (called
electron guns) that shoot three beams of electrons at these
phosphor dots. When the electrons hit the dots they glow and a
colour picture is produced. Increase the beam strength and you
increase the amount of red, green or blue light produced at any part
of the screen. This, in effect, allows the colours to mix into just
about any colour and brightness imaginable. You might compare this
to mixing coloured paints together to form new colours. Whatever way
you look at it, this produces a colour picture that looks almost
like real life.
Interlace
Next is the important point! To produce a picture,
these electron beams are controlled by electromagnets to scan from
side to side across the TV screen (as illustrated in the picture
below). The beams fly across the screen in the same motion our eyes
use when we are reading a book: they start from the left, finish one
line and then shoot back to start the next line.

When TVs were invented in the 1920s the type of
phosphor used to produce the colours did not respond very fast. This
meant it was impossible to get a picture in one shot; instead we
would get a flickery strobing effect moving down the screen! To
solve this they decided that instead of putting the lines on the
screen one at a time (i.e. lines: 1, 2, 3, 4, 5) they would put them
on every other line in one pass (i.e. 1, 3, 5, 7, 9) and then
in-between the previous lines on the second pass (i.e. lines: 2, 4,
6, 8 etc.). This allowed a whole picture to be produced in two very
fast scans and allowed enough time for the slow phosphor dots to
recover. This, then, prevented any strobing effects from appearing -
success! This process is called interlacing!
Resolution

An analog TV's resolution refers to the number of
horizontal lines displayed on the screen. This is broken up into the
active and non-active areas. The non-active or blanked area (A) is not used
for the actual television picture and is basically always 'blanked'. The extra
signal information that would have gone here is often used for
closed captioning, sync info or other data such as VITC. But
obviously the bit we are interested in is the active part, which is
where the actual picture appears (B).
NTSC
The TV industry is dominated by two main standards for
TV design: PAL and NTSC. NTSC is one of my pet hates, basically
because of its rather low quality and use of weird framerates. NTSC
stands for the National Television Systems
Committee; it is the colour video standard used in the United
States, Canada, Mexico and Japan. Some engineers have said it
should stand for Never Twice Same Color
because no two NTSC pictures look alike :). Due to the electric
system used in the US it was decided to scan the lines across the
NTSC TV screen at about 60Hz (or 60 half frames per second), which
produces 30 whole pictures every second. NTSC resolution is about
one sixth less than that of PAL - about 89 lines less. This may not
seem so bad, but divide a sheet of paper into six even parts and
chop one off the bottom and you will see how much detail is lost.
NTSC uses 525 horizontal lines, of which only 487 make up the active
picture.
PAL
PAL stands for Phase Alternating
Line; it is the TV standard used in Europe, Hong Kong and
the Middle East. It is a newer standard based on the old NTSC system
but designed to correct the NTSC colour problems produced by phase
errors in the transmission path. PAL resolution is 625 horizontal
lines, but only 576 of these are used for the picture. PAL is higher
quality than NTSC: it keeps a sharper picture and remains closer to
the original format produced by motion picture cameras. Due to the
European electric standards it was decided to interlace PAL lines
every other line at 50Hz, producing 25 whole frames every second.
TELECINE
This is the bit you've all been dying to read.
Unfortunately I have not written this with a bunch of amazing
solutions in mind. The idea is more to help you understand what is
going on with your video so you can decide how you will process it
better.
Just so you don't get confused, you should be clear on
the difference between a frame and a field. A 'field' is
basically every other scan line of a picture. Two fields stuck
together make a single frame on a TV set! In the picture below only
one field is displayed on the left. It's hard to see because only
every other line is displayed. The picture on the right is a whole
frame. It is produced when we stick both fields together.
ONE FIELD (A HALF FRAME)        TWO FIELDS (A WHOLE FRAME)
Single fields that start from line 1 of the TV screen
are called 'odd' because they go in odd numbers (i.e. 1, 3, 5 etc.).
Fields that start from the second line to fill the gaps of the first
are called 'even' because they go in even numbers (i.e. 2, 4, 6
etc.). Fields that start from line 1 are more often called
"Top" fields because they start from the first, "top" line of the
screen, whereas single fields that start from the second line down
are called "Bottom" fields. Okay, now everything you read should
make perfect sense =)
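For the programmers among you, here is a minimal Python sketch of the same idea, treating a frame as nothing more than a list of scan lines (my own toy representation, not any real video API):

    # A frame here is just a list of scan lines, line 1 first.
    def split_fields(frame_lines):
        top = frame_lines[0::2]      # lines 1, 3, 5, ... (the "top"/"odd" field)
        bottom = frame_lines[1::2]   # lines 2, 4, 6, ... (the "bottom"/"even" field)
        return top, bottom

    def weave(top, bottom):
        # Interleave the two fields back into one whole frame.
        frame = []
        for t, b in zip(top, bottom):
            frame += [t, b]
        return frame

    frame = ["line %d" % n for n in range(1, 7)]
    top, bottom = split_fields(frame)
    assert weave(top, bottom) == frame   # two fields stuck together = one frame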
TELECINE
As I have already mentioned, a motion picture camera
captures its images at 24 frames every second. Each frame is a full
image. An NTSC television, however, must play 30 frames per second,
and each of these frames must be interlaced into two fields, a top
and a bottom! So basically what we are saying is that we must play
60 half frames (or fields) every second. The only way we are going
to be able to play a 24 fps motion picture on an NTSC television is
to change it from 24 fps to 30 fps and interlace these frames into
two fields, making 60 half frames per second. This transformation is
done with a machine called a Telecine. A Telecine machine does
something called pulldown, which, in its simplest explanation,
"pulls down" an extra frame from every four frames to make five
whole frames instead of four!
3:2 Pulldown NTSC
3:2 pulldown is a name that confuses people, basically
because the term "pulldown" is rather ambiguous - in other words,
it's not really pulling down anything! The process sounds
complex but it's really quite straightforward, and I have designed a
picture to illustrate it. The top row in the picture below represents
four frames from a motion picture camera. These are full frames,
not yet interlaced, and they are represented as A, B, C, D.
Now look at the second line in our picture below. The
Telecine machine takes the first whole frame A and splits it into
three fields (stop reading and take a look now). For the first field
it uses the top field (T), which means it takes lines 1, 3, 5,
etc., from the original digitized picture. The next field taken from
A is the bottom field, so it takes lines 2, 4, 6 and so on. The
third field we see, labeled (Tr), is just a copy of the first
field again (so I labeled it T(r) to mean 'top
repeated').
Now the Telecine machine moves on to the next frame, B.
This time it just takes two fields, a bottom and a top. Then we move on
to the third frame, C; it is split into three fields: bottom
(B), top (T) and a repeat of the bottom one again
(Br). Finally, the fourth frame D is split into the top and
bottom fields. That's it - that is all a telecine machine
does!

In short, this results in a field order of 3 fields, 2 fields, 3 fields, 2 fields! Or, if
it's easier to understand, our picture above shows it as: 3 yellow, 2 green, 3 blue, 2 red.
So that is why it is called 3:2 pulldown: it goes in a
sequence of 3, 2, 3, 2 and so on. It can
be said to "pull down" a whole frame and split it into three fields, then two
fields. Finally, after the Telecine machine has finished the
fourth frame D, it starts the process all over again with the
next four pictures of the movie.
In short we end up with:
At Ab At(r) / Bb Bt / Cb Ct Cb(r) / Dt Db
But because it always goes: top,
bottom, top, bottom, top, bottom etc., we would just say it
without indicating top or bottom fields. So instead of the above we
would describe it as:
AAA BB CCC DD
Whatever way you look at it, in the end you get
5 whole frames instead of 4. This turns a 24 fps movie into a 30 fps
movie!
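If it helps, the whole 3:2 pattern can be written as a few lines of Python. This is just my own sketch of the field pattern described above, using the A/B/C/D labels from the picture (it is not how a real telecine machine is programmed):

    # 3:2 pulldown: every 4 film frames -> 10 fields -> 5 interlaced TV frames.
    def pulldown_32(frames):                # frames: e.g. ["A", "B", "C", "D"]
        fields = []
        for i, f in enumerate(frames):
            if i % 4 == 0:                  # A-type frame: top, bottom, top repeated
                fields += [f + "t", f + "b", f + "t(r)"]
            elif i % 4 == 1:                # B-type frame: bottom, top
                fields += [f + "b", f + "t"]
            elif i % 4 == 2:                # C-type frame: bottom, top, bottom repeated
                fields += [f + "b", f + "t", f + "b(r)"]
            else:                           # D-type frame: top, bottom
                fields += [f + "t", f + "b"]
        return fields

    print(pulldown_32(list("ABCD")))
    # ['At', 'Ab', 'At(r)', 'Bb', 'Bt', 'Cb', 'Ct', 'Cb(r)', 'Dt', 'Db']
    # 8 fields' worth of film becomes 10 fields: 24 fps x 10/8 = 30 fps.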
Interlacing the picture back together
Let's look at the picture again, at the third line down. Here
we can see how these fields would be woven back together to produce
whole pictures again, as we would see them on a TV or computer screen.
The top field of frame A is woven together with the bottom field of
frame A. Then the repeated top field of frame A is woven
together with the bottom field of frame B. The top field of frame B
is woven with the bottom field of frame C. The top field of frame C is
woven together with the repeated bottom field of frame C. And
finally, the top field of frame D is woven together with the bottom
field of frame D.

That's quite a mouthful to explain in words, but
examine the picture and it should really explain itself. Since the
fields are now stuck together into frames, instead of describing the
telecined movie field by field (top, bottom, top, bottom) in the
order:
AAA BB CCC DD
we would say:
AA AB BC CC DD
The change is only in how we group the letters, of course,
and means nothing more.
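Continuing the toy Python notation from the sketch above (and nothing more official than that), the regrouping is just a matter of pairing consecutive fields:

    # Pair consecutive fields into whole TV frames: AA, AB, BC, CC, DD.
    def weave_pairs(fields):
        return [(fields[i], fields[i + 1]) for i in range(0, len(fields), 2)]

    fields = ['At', 'Ab', 'At(r)', 'Bb', 'Bt', 'Cb', 'Ct', 'Cb(r)', 'Dt', 'Db']
    for first, second in weave_pairs(fields):
        print(first, "+", second)
    # At + Ab, At(r) + Bb, Bt + Cb, Ct + Cb(r), Dt + Db  ->  AA AB BC CC DD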
A Weird Framerate
This is not quite the end of the saga. The old black
and white TVs used to play back at a perfectly round 30 fps. But as
usual NTSC found a way to destroy that perfection! With the
introduction of colour TV it was decided (for technical
reasons which I don't understand) that the video must be played back
at 29.970 fps (59.94Hz), which is basically only 99.9% of its full
speed. As a result NTSC movies still have the same number of frames
they did when they were telecined, but they are played back at a
fractionally slower rate.
2:2 Pulldown PAL
PAL movies also get telecined, but not in the same way
an NTSC movie does. A Telecine machine will use what is sometimes
called 2:2 pulldown! This basically turns every frame into two
fields so they can be played on a standard PAL television: each film
frame becomes two fields, and playing those fields on a TV set at 50Hz
(50 fields per second) produces 25 whole frames per second. So instead
of going 3, 2, 3, 2, 3, 2 it goes 2, 2, 2, 2, 2, 2! This produces the fields:
At Ab / Bt Bb / Ct Cb / Dt Db
Or just:
AA BB CC DD
Again, a PAL movie will contain all the frames from a 24fps film
with no additional ones, but it will still play those frames
back faster, at 25 fps. In a way of speaking it is just as correct
(or wrong) to say that a PAL movie is 24 fps, because no frames have
been added to it; they are just played back faster.
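Just to put a number on "played back faster", here is the plain arithmetic (nothing PAL-specific in the code itself):

    film_fps, pal_fps = 24.0, 25.0
    speedup = pal_fps / film_fps                 # 1.0416..., i.e. roughly 4% faster
    print("PAL plays film about %.1f%% fast" % ((speedup - 1) * 100))
    # A 100 minute film therefore runs for about 96 minutes on PAL.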
INVERSE TELECINE (IVTC)
I think I'm correct in saying that there is no such thing as an
Inverse Telecine machine :). But, as the name suggests, inverse
telecine is a process that turns a 30 fps movie back into a 24 fps
movie. Basically, it takes out all those extra fields that were added
to the movie to make it 30fps. It's about now that I start
spluttering, because this is an awkward subject and I can't
find any information on exactly how Inverse Telecine is
performed! So instead I will describe what it looks like should be
done, based on how the movie was telecined in the first place.
Let's go back to our picture! As you can see from the second row
down, to turn the 24fps movie into 30fps we had to separate the
pictures into 10 single fields (or half frames) by adding two fields
that wouldn't normally be there. Counting from left to right, all
we need to do to turn our 10 fields back into 8 fields (to turn
30 fps back into 24 fps) is to delete fields 3 and 8. Remember we are
talking about fields here, not frames.

But taking out fields 3 and 8 would produce a movie that had a
field order of: top, bottom,
bottom, top, bottom, top, top, bottom!
Since you cannot weave together two bottom fields or two
top fields we would need to swap them around. So imagine the
order of the numbers as:
Fields:  1, 2   3, 4   5, 6   7, 8
Order:   T, B   B, T   B, T   T, B
To get the correct order we must change them to:
Fields:  1, 2   4, 3   6, 5   7, 8
Order:   T, B   T, B   T, B   T, B
Which gives us an order of: 1, 2, 4, 3, 6, 5, 7, 8 which should
theoretically fix everything.
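Based purely on that reasoning (and not on how any real IVTC filter is actually implemented), a naive Python sketch of the idea might look like this, using the same toy field labels as before:

    # Naive inverse telecine on one 10-field group.
    def ivtc_group(fields):
        # Delete fields 3 and 8 (counting from 1), i.e. the repeated fields.
        kept = [f for i, f in enumerate(fields, start=1) if i not in (3, 8)]
        # The order is now T B  B T  B T  T B; swap the middle pairs so every
        # frame is top-first again, giving the order 1, 2, 4, 3, 6, 5, 7, 8.
        kept[2], kept[3] = kept[3], kept[2]
        kept[4], kept[5] = kept[5], kept[4]
        return kept

    fields = ['At', 'Ab', 'At(r)', 'Bb', 'Bt', 'Cb', 'Ct', 'Cb(r)', 'Dt', 'Db']
    print(ivtc_group(fields))
    # ['At', 'Ab', 'Bt', 'Bb', 'Ct', 'Cb', 'Dt', 'Db'] -> 8 fields = 4 whole frames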
The Framerate Mystery Unraveled! 23.976 / 24 /
29.970 / 30
If the only framerates we use are 24, 25 and 29.97,
then why do people speak of 23.976? This is to do with how the
movie has been created. A 25 fps movie still has the same number of
frames as a 24 fps movie, because none have been added. But
nevertheless a PAL television chooses to play them back at 25 fps.
This makes the PAL movie play back over a slightly shorter running
time, and means the audio would end up slightly out of sync. To
compensate for this, when a movie is telecined they apply what is
called a 'pitch correction', which speeds up the audio to match the
playback speed; in the case of PAL this means a pitch correction
of about 4%.
A movie that has been given 3:2 pulldown ends up at 30 fps.
But an NTSC television will play it back slightly slower, at
29.970 fps (59.94Hz). The actual number of frames hasn't changed;
none have been added or taken out! Here is where the 23.976 part
comes in. If we inverse telecine a 30 fps movie we end up with
24 fps. But if we inverse telecine a 29.970 fps movie, because it
runs at a slightly slower speed, instead of getting 24 fps as we should,
we end up with the slightly slower rate of 23.976 fps.
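The arithmetic is easy to check for yourself:

    ntsc_fps = 29.970
    print(round(ntsc_fps / 30, 3))      # 0.999  -> 99.9% of the "round" 30 fps
    print(round(ntsc_fps * 4 / 5, 3))   # 23.976 -> 4 film frames back out of every 5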
PROBLEMS WITH INTERLACED MOVIES
Interlaced movies look fine on a standard TV, but they appear
terrible on a PC monitor!? Let's take a look at our example one last
time to see why. Look at the last row, where it shows how the top
field of frame B is interlaced together with the bottom field of
frame C.

We are getting the top and bottom fields from two
completely different frames!! Imagine taking half of one picture and
half of the next and trying to put them together into a single
picture - it's impossible! On a PC this produces what we see below.
Here we have Star Trek's William Riker walking across the room from
left to right. Notice that the top field from the previous frame
shows him a little to the left and the bottom field of the next
frame shows him a little to the right. This is what produces this
combing effect, and no amount of shifting the lines to the left or
the right will fix it!

Inverse Telecine Troubles
Look at our illustration one more time. A 3:2 pulldown
movie can also be encoded as 2:3, which produces exactly the
same result but backwards - instead of getting 3, 2, 3, 2 we get 2,
3, 2, 3! But this doesn't matter much, because since a 3:2 pulldown
movie can be cut and edited after it is made, the very first frame
doesn't always start with the top field of A anyway! It
could, for example, start with the next one across - the bottom
field of A. In fact, it could start with absolutely
any of the 10 fields in the sequence!

Hence, as far as I can see, there must be at least 10
ways to perform inverse telecine: five assuming the first field is
top and five assuming the first field is bottom. Let me know if you
know exactly how IVTC works and I'll update this article to explain
it better.
OTHER ISSUES
Some of the specials and extra features on a DVD seem
to have been recorded from a telecined 29.970 fps source! This means
that the interlaced picture was actually edited as an interlaced
picture on a computer and then reinterlaced again! There is
absolutely no way to fix such a problem, because the interlace lines
are now literally part of the original picture. For example, I have
taken a frame from the trailer of one DVD and separated the
fields into two. When I squash all the lines together from one
single field I get the following picture:
Of course, I could be completely mistaken about this,
but that is what appears to be the case.
Capture Cards
Most of the graphics cards, TV tuners and video
capture hardware we use to record video to the PC will not perform
any kind of IVTC. Neither do they seem to give a damn what order
they whack the TV fields together in. This means that regardless of
whether you use PAL or NTSC, if you want to capture any video footage
at more than 240 pixels high (for NTSC; for PAL it is 288) you will
get at least some interlace problems! When you are capturing below
240 pixels the capture card will only use one field, and interlace
problems are almost impossible. If your capture card can get a larger
picture without problems, check the instructions to see how it is doing
it! You may find to your horror that it is actually just capturing
at 240 and enlarging the picture after it has been captured.
This is obviously a serious waste of space!
Deinterlace filters
Since performing inverse telecine (IVTC) to turn a 30
fps movie back into a 24 fps movie is so awkward, a few
alternatives have been designed that work on just about any
movie. There are only two types that I know of:
Bobbing: To bob basically means to
enlarge each field into its own frame by interpolating between the
lines. So from one field we are producing a full frame. Because the
top fields are a line higher than the bottom ones, the image may appear
to "bob", but this is usually fixed by nudging the whole frame up or
down a pixel. You are only really getting half the resolution with
bob, but the interpolation is usually very good quality. If you are
stuck for a way to bob your video, my AVISynth guide offers a bob
feature - check it out Here.
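To give a rough idea of what bobbing does (this is my own minimal sketch using numpy, not the code any particular filter uses), stretching one field back to full height by interpolating between its lines looks something like this:

    import numpy as np

    # Minimal bob sketch: a field (every other line of a frame) is stretched
    # back to full height by averaging each pair of neighbouring field lines.
    def bob_field(field):                          # field: 2-D array, (h/2) x w
        h, w = field.shape
        out = np.empty((h * 2, w), dtype=field.dtype)
        out[0::2] = field                          # keep the original field lines
        padded = np.vstack([field, field[-1:]])    # repeat the last line at the end
        out[1::2] = (padded[:-1] + padded[1:]) / 2.0   # interpolate the gaps
        return out

    frame = np.arange(24, dtype=float).reshape(6, 4)   # toy 6-line "frame"
    top_field = frame[0::2]                            # 3 lines
    print(bob_field(top_field).shape)                  # (6, 4): full height again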
Blending: Flask Mpeg's deinterlace
filter looks for the parts of a picture where the two fields do not
match and blends the combing effect together. The lower the
threshold, the more the two parts are blended and the less of a
combing effect appears. The problem with this method is that the
final picture can quite often end up a bit blurry.
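And a very rough sketch of the blending idea, loosely in the spirit of what is described above (a threshold deciding where to blend) rather than Flask Mpeg's actual code:

    import numpy as np

    # Where neighbouring lines differ by more than a threshold (likely combing),
    # replace the lower line with the average of the two; a lower threshold
    # means more of the picture gets blended.
    def blend_deinterlace(frame, threshold=20):
        out = frame.astype(float).copy()
        diff = np.abs(out[1:] - out[:-1])          # line-to-line difference
        mask = diff > threshold                    # likely combing areas
        blended = (out[1:] + out[:-1]) / 2.0
        out[1:][mask] = blended[mask]
        return out

    frame = np.random.randint(0, 256, size=(8, 8)).astype(float)
    print(blend_deinterlace(frame).shape)          # (8, 8): same size, combing softened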
DVD & TELECINE
DVDs offer a strange twist to the whole Telecine and
3:2 pulldown business. Almost all DVDs have the movie stored
as whole pictures at 24 fps. This is the original format of the film,
with no Telecine. At the start of every Mpeg-2 DVD file there are
certain header codes that tell the player how to play back the DVD.
Since the movie is stored digitally, the player can hand the fields or
frames from the DVD to the hardware or software in any order it likes,
so it can split the movie into two fields and perform telecine on the
fly. To do this it has three flags that can be applied in the header
code: RFF (repeat first field), TFF (top field first) and FPS (frames
per second).
For a PAL DVD the FPS flag can be set to 25 and the
player will send the picture information to the hardware at 25 fps
instead of the 24 fps stored on the disc.
For NTSC DVDs the movie needs to be 29.970 fps, so the
FPS flag is set to 29.970. But on its own that looks odd, because the
movie would be over far too soon. Imagine it like playing cards: if
you throw 4 cards on the floor every second, the whole pack will be
finished in half the time compared with throwing 2 cards a second.
The solution is to telecine the movie with 3:2 pulldown to increase
the number of "cards" we have to start with. To do this the RFF and
TFF flags are set in the header code. By setting the DVD to repeat
the first field you make the video display the fields in the
3, 2, 3, 2 pattern. By setting the TFF flag you set the DVD to start
from the top field, so the displayed order always goes: top, bottom,
top, bottom.
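As a rough sketch of how a player might act on those flags (this is only an illustration of the idea; it is not real MPEG-2 header parsing, and I have simply toggled the top-field-first parity whenever a field is repeated so the top/bottom order stays correct):

    # Soft telecine sketch: the disc stores progressive film frames; RFF and TFF
    # in each picture header tell the player how many fields to show and which
    # field comes first.
    def play_soft_telecine(frames):
        fields = []
        tff = True                              # start top-field-first
        for i, frame in enumerate(frames):
            rff = (i % 2 == 0)                  # repeat a field on every other frame
            order = ["top", "bottom"] if tff else ["bottom", "top"]
            shown = order + order[:1] if rff else order
            fields += [(frame, f) for f in shown]
            if rff:
                tff = not tff                   # the extra field flips the parity
        return fields

    out = play_soft_telecine(list("ABCD"))
    print(len(out))                             # 10 fields from only 4 stored frames
    print([frame + field[0] for frame, field in out])
    # ['At', 'Ab', 'At', 'Bb', 'Bt', 'Cb', 'Ct', 'Cb', 'Dt', 'Db'] -> 3, 2, 3, 2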
Theoretically, then, it should be possible to patch the
header code of a DVD's Mpeg-2 file and make it play back at 24 fps
instead of 29.970 fps! In fact some people have made patches to
do this, but so far, for another unknown reason, they are very
unreliable and the video turns out just as bad!
Progressive and Interlaced together!
I don't think I have mentioned yet what a progressive
image is? A progressive image is a whole frame that is not
interlaced. Motion picture cameras capture images that are
progressive; they are not telecined or split into separate fields.
A computer monitor does not need to interlace to show a picture on
the screen like a TV does: it draws the lines one at a time, in
perfect order, i.e. 1, 2, 3, 4, 5, 6, 7
etc.
Many DVDs are encoded as progressive pictures, with
interlaced field-encoded macroblocks used only when needed for
motion. Flask Mpeg tries to take advantage of this fact: if
you set it to 24 fps (or 23.976) it will give you the option to
reconstruct progressive images. This does not perform any
deinterlacing on the video; it ignores all the flags and just reads
the DVD one progressive image at a time.
This is another confusing issue for me. I have no idea
how a DVD movie can be both interlaced and progressive other than by
the fact that a progressive movie can be played back as interlaced
due to control flags. If I learn any more about this I will update
my articles accordingly.
VHS, VCD & DVD
To finish, perhaps it would be nice to say a few words
about the video formats too. It wasn't long after TV that VHS video
recorders appeared on the scene, and a while later still the
Video CDs did. Of course, there were other video formats, but
VHS (Video Home System) and MPEG (Moving Picture Experts Group)
won the battle, at least as far as home video was concerned. This is
a little strange really, because Sony's Betamax video was probably
the better quality! Anyway, all video formats to date have required
one form of compression or another to be able to record the huge
quantities of information needed to store full motion video.
VHS
VHS video is stored, just like audio, on a reel of
plastic tape impregnated with ground-up iron. This plastic tape is
spun in front of an electromagnet that replicates the strength of
the TV's electron beams as they scan across the screen.
This causes magnetic 'kinks' in the iron particles of the tape that are
almost identical to the original TV signal. Reversing this
storage process reproduces the image on the TV screen. The
signal is simplified before it reaches the tape, making it
take up less space.
As anyone who has ever used video tape knows, it soon
loses quality. It appears grainy, loses colour accuracy and starts
to produce white glitches and audio waver - a better solution was
needed.
MPEG-1
As computer technology advanced, CD-ROM video formats
became popular and the Moving Picture Experts Group designed a
compression format that could store over an hour of VHS-quality
video on a single CD-ROM. This soon became very popular in the East
but never truly caught on anywhere else, because recording it was
difficult and slow and the quality was not really any better than
normal VHS anyway. The big advantage of Mpeg-1 video was that it was
almost impossible for it to lose picture quality like a VHS
videotape! It could last perhaps over 100 years of use without any
noticeable degradation of image quality!
MPEG-2
Since (at the right bitrate) Mpeg-1 was able to
produce TV-quality pictures superior to VHS, the Mpeg organization
decided to design another version that combined Mpeg-1-style
compression with interlaced images so it could be used for TV
broadcasts. This format was called Mpeg-2. Other features were added
to Mpeg-2 to make it compress slightly better and at higher quality,
but the main difference was the addition of interlace support.
Mpeg-1 VideoCDs showed that CD-based digital video was not only a
viable option but also a very preferable one - provided the storage
space was enough. When disc designs were upgraded to be able to store
4.38 gigabytes or more of information, it was decided that these new
discs would be the new storage medium for video. The format was
called DVD, meaning Digital Video Disc, although this was later
changed to Digital Versatile Disc because it was 'versatile' enough
to hold other data besides video.
Resolutions
Resolutions are an important issue for amateur video
enthusiasts who want to capture their video at full TV quality.
Professional video editors are told to capture at 640 x 480 pixels
for the highest quality. But a PAL TV resolution is 576 lines. Then we
have the Mpeg group saying that 352 x 288 is the full VHS
video resolution! The problem seems to lie in the fact that it's hard
to equate a TV resolution with a computer image. The TV picture is
built up of lines, but the dot definition is rather "fuzzy" looking.
So rather than rattling on about the pros and cons here, I will merely
end this article by quoting what the Ligos corporation (the creators
of the LSX Mpeg-2 encoder) say on the subject:
"The resolution of computer video,
however, doesn't generally equate to the video world of
televisions, VCRs, and camcorders. These devices have
standards for resolution that are generally focused on the
horizontal resolution (the number of scan lines from
top-to-bottom that make up the picture). Here are some numbers
for comparison:
Video Format        Horizontal Resolution
Standard VHS        210 Horizontal Lines
Hi8                 400 Horizontal Lines
Laserdisc           425 Horizontal Lines
DV                  500 Horizontal Lines
DVD                 540 Horizontal Lines
With these numbers in mind, it is
important to remember this rule when bringing the worlds of
computer and video together: the quality of an image will
never be better than the quality of the original source
material.
We suggest capturing at a resolution
that most closely matches the resolution of the video source.
For video sources from VHS, Hi8, or Laserdisc, SIF resolution
of 352x240 will give good results. For better sources such as
a direct broadcast feed, DV, or DVD video, Half D1 resolution
of 352x480 is fine. There are other advantages to following
these guidelines. Your files will be smaller, consuming less
space on the hard drive or on recordable media like CD-R and
DVD-RAM. You'll also be able to encode more quickly".