The Truth About 2K, 4K and The Future of Pixels
Editor's Note: This article was originally posted in 2009, which means that parts of it are significantly out of date. Regardless, many of the issues raised, especially around frame rate and approaches to sensor size, are still very much being discussed all these years later. Please enjoy this as the now-historical document that it is, a snapshot of the industry in the early days of widely-accessible digital cinema cameras, without expecting it to be anything else ~Tim Wilson, Editor-in-Chief.
John Galt: "Pixel" is an unfortunate term, because it has been hijacked.
Historically, 2K and 4K referred to the output of a line array scanner scanning film, so that for each frame scanned at 4K, you wind up with four thousand red pixels, four thousand green and four thousand blue.
For motion picture camera sensors, the word "pixel" is kind of complicated. In the old days, there was a one-to-one relationship between photosites and pixels. All of the high-end high definition video cameras had 3 sensors: one red, one green and one blue photosite to create 1 RGB pixel.
But what we have seen, particularly with these Bayer pattern cameras, is that they are basically sub-sampled chroma cameras. In other words, they have half the number of color pixels as they do luminance. And the luminance is what they call green, typically. So what happens is you have two green photo sites for every red and blue.
So how do you get RGB out of that? What you have to do is interpolate the reds and the blues to match the greens. So you are basically creating, interpolating, what wasn't there. You're imagining what it is, what it's going to be. That's essentially what it is. You can do this extremely well, particularly if the green response is very broad.
Well, in the world of the professionals who do this, when you say "4K," it means you have 4096 red, 4096 green and 4096 blue photo sites. In other words...
Creative COW: 4000 of each. 4K.
John Galt: Right.
But if you use the arithmetic that people are using when they are taking all of the photosites on a row and saying they're 4K, they are adding the green and the blue together and saying, "Oh, there are 4K of those, so it's a 4K sensor." Now actually, in order to get RGB out of a Bayer pattern you need two lines. Because you only have green plus one color (red) on one line, and green plus the other color (blue) on the other line. You then have to interpolate the colors that are missing from surrounding pixels.
Note that there are twice as many green pixels as red or blue on this representation of a Bayer pattern sensor. To create a single RGB pixel, there must be an equal number of each color, so the choice is whether to discard green pixels and lose luminance detail, or to use interpolated, aliased red and blue pixels.
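The row-by-row layout described above can be sketched in a few lines of Python. This is only an illustration: the RGGB ordering and the 4096 x 2048 sensor size are assumptions chosen for the example, not the specification of any particular camera.

```python
def bayer_row(width, row_index):
    """Return the colour letters of one sensor row in an RGGB Bayer mosaic."""
    # Even rows alternate red and green; odd rows alternate green and blue.
    pair = "RG" if row_index % 2 == 0 else "GB"
    return pair * (width // 2)


def bayer_counts(width, height):
    """Count photosites per colour over `height` rows of `width` photosites."""
    counts = {"R": 0, "G": 0, "B": 0}
    for y in range(height):
        row = bayer_row(width, y)
        for colour in counts:
            counts[colour] += row.count(colour)
    return counts


# Hypothetical sensor with 4096 photosites per row and 2048 rows.
row0 = bayer_row(4096, 0)
print(row0.count("G"), row0.count("R"), row0.count("B"))  # 2048 2048 0
print(bayer_counts(4096, 2048))
```

No single row contains all three colors, and across the whole mosaic green is sampled twice as often as red or blue, which is why full RGB values have to be interpolated from neighboring rows.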
Let's go back to scanning a film frame. The aspect ratio of a full 35mm film frame is basically 4x3. So if you have 4096 photo sites across the width of the film, in red and green and blue, and 3K along the height, you would have 4K by 3K. You'll have 12 million green photo-sites, 12 million blue photo-sites, 12 million red photo-sites.
That's 36 million photo-sites. A 36 mega-pixel image is what you get from a 4K scan.
Now you know very well that you cannot take an 8.3 million pixel sensor and create 36 million out of that without interpolation. You are up-converting, and there's really no value to the up-conversion. There's no new information.
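The arithmetic behind those figures is easy to check. The sketch below uses both the rounded 4000 x 3000 numbers from the conversation and the exact 4096 x 3072 of a 4:3 frame; the function name is just a label for this example.

```python
def rgb_samples(width, height):
    """Measured samples when every pixel has its own red, green and blue value."""
    return width * height * 3


rounded = rgb_samples(4000, 3000)   # the "36 million" quoted above
exact = rgb_samples(4096, 3072)     # 4096 across a 4:3 frame
bayer_photosites = 8_300_000        # the sensor size quoted above

print(rounded, exact)                          # 36000000 37748736
print(f"{bayer_photosites / rounded:.0%}")     # 23%
```

On these numbers, fewer than a quarter of the values in the up-converted file correspond to something the sensor actually measured.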
So 4K is not these 8 mega pixel or 9 mega pixel or 10 mega pixel CMOS images for the Bayer pattern where they add up all the pixels in a row and say hey, we got 4K. The great perpetrators of that mythology have been RED and Dalsa. That's why I call these "marketing pixels." It's intentional obfuscation. Because they really do nothing to improve image quality. They may improve sales volume. But they don't do anything to quality.
But somehow the world has accepted that that's 4K. It's purely semantic. It's like saying, "I don't like my weight in pounds so I converted to kilos. It sounds better!" You'd be amazed at how many non-technical people I meet, often producers and directors, but sometimes even cinematographers get fooled by that stuff.
There's a fundamental problem with the Bayer sensors. I mean, in 1972, when Dr. Bryce Bayer at Kodak couldn't make sensors with lots of photo-sites, his was a brilliant idea, and it works very well in still cameras. But with any camera with a fixed sampling structure, in other words any CCD or CMOS camera with discrete photo-sites, you have to use an optical low pass filter to make sure that you don't create a moire pattern in the final image.
If you design the optical low pass filter to satisfy the requirement of the frequency of the green samples to maintain the highest resolution, the red and blue photo-sites, which are half as frequent as the green, will have aliases. However, if you design the optical low pass filter to make sure that you don't get a color alias from red and blue, then you are throwing away some of the resolution from the green.
So you can never get the resolution you might expect from a Bayer pattern. Someone can argue this until they are blue in the face but we are dealing with the limitations of the physics of optics and the mathematics of sampling theory, and you can't escape it. There'll always be aliases from a device with a fixed sampling structure, such as an array of photo-sites on a sensor, if you try to record frequency information that is greater than half the number of available samples. Of course, sometimes the limitations of the camera lens act as the optical low pass filter!
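The "half the number of available samples" rule can be illustrated with a one-dimensional sketch. The sample counts below (2048 for green, 1024 for red) are illustrative assumptions that take "half as frequent" literally along a single line; a real Bayer mosaic is two-dimensional and its geometry is more involved.

```python
def nyquist_limit(samples):
    """Highest detail frequency (in cycles) that `samples` samples can record."""
    return samples / 2


def alias_frequency(detail_cycles, samples):
    """Frequency that actually appears after sampling `detail_cycles` of detail."""
    folded = detail_cycles % samples
    return min(folded, samples - folded)


green_samples = 2048   # illustrative
red_samples = 1024     # illustrative: half as many as green

detail = 900  # cycles of fine detail across the line
print(nyquist_limit(green_samples), alias_frequency(detail, green_samples))  # 1024.0 900
print(nyquist_limit(red_samples), alias_frequency(detail, red_samples))      # 512.0 124
```

Detail that the green samples record faithfully comes back from the red samples as a false, much coarser pattern unless the optical low pass filter removes it first, and removing it costs green resolution.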
Now if you use the same arithmetic that the people claiming they have 4K cameras are using, then Genesis would be 6K. Because it has 5760 pixels on one line: 1920 red, 1920 green and 1920 blue. But isn't that a little bit nonsensical? But I think it's no more nonsensical than essentially compressing the heck out of an image, then interpolating that up to create a DPX file which is enormous and saying, "Wow, we got 4K." I think that people will start to understand this and realize that it creates a terrible problem with post, because you have so much more empty data to process.
The most important issue from our point of view, is that we want to have equal resolution, TRUE edge resolution in red, green and blue. The most important thing is not to have interpolated information. You want to know that the edge is REAL.
This is because our cameras are used for doing high-end image compositing. I'm not talking about 100 people sitting at work-stations rotoscoping images. I'm talking about being able to shoot a blue screen or a green screen and using software like Ultimatte Advantage to pull perfect linear mattes from smoke, fire, transparent objects, or liquids - things that can't be roto'd.
PIXELS AND RESOLUTION
Another problem with a message built on "marketing pixels" is that it confuses pixels and resolution. They don't have anything to do with each other. What defines the resolution, quite frankly, is the optics more than the sensor.
My wife has a Ricoh GX 100. It's a beautiful little camera with a 10 million photo-site sensor. But it's not nearly as nice a picture as my old 6 mega-pixel Canon D60.
When we released the [Panavised version of the Sony] HDW-F900, dubbed "the Star Wars camera," it was a 2/3rd inch camcorder. People asked, "Why are you doing this?" Well, because it only weighs 12 pounds, it's got a battery, there's no need for an umbilical cord, and it's got a built-in recorder just like a film magazine.
Almost everyone in the industry laughed at it, but it has proved to be unbelievably successful. That camera is still renting every day with the Primo Digital lenses we designed for the 2/3" format, and really, you'd be hard pressed to get a better image. So you have to look at the whole system, not latch on to just one parameter and say "That's what we're gonna go for!" Everything has to work together as an imaging SYSTEM.
Unfortunately, one of the tragedies of digital imaging is that now we've got these ridiculous numbers games. Because so few people understand the fundamentals of the imaging technology, everybody wants a number to latch on to. The numbers don't mean anything in the context of 100 years of development of film and motion picture technology, optical technology and laboratory practice. And cinematographers did wonderful work without understanding anything about the chemistry or photographic emulsion technology.
Whenever I do a presentation about digital imaging, my first question these days is, "Anybody know how many grains of silver are on a frame of film? Hands up, hands up!" Nobody ever puts their hand up. My second question is, "Hands up! Anybody ever thought about this before?" You can tell the nerds in the audience from the hands that go up!
See also John Galt's presentation with Canon's Larry Thorpe, "Demystifying Digital Cameras."
So why do we care? Because after 100 years of being comfortable with a relatively stable film based motion picture technology, along comes this new and disruptive digital imaging technology, and we're all clutching for some magic number that we can carry around in our heads, and this will define the process for us. Sorry, it doesn't work that way. It's messy and it's complicated, and lots more so today than it was in the days of film.
4K, IMAX AND FRAME RATES
The 4K system that most people know is IMAX -- and it doesn't quite make 4K, which is a surprise to people. "How can that possibly be?" you say. "It's an enormous big frame." Well, because of what I was talking about earlier: the physics of optics. When you take the entire system into account -- from the lens of the camera to the movement of the light through the projector, all slightly reducing resolution -- you wind up with less than the full resolution you started with.
A number of years ago some IMAX engineers -- and I don't think IMAX ever let these guys out of their lab again -- did this wonderfully elegant experiment at the Large Film Format Seminar at the Universal Studios IMAX theatre. They showed this film they made that began with 2 rows of 2 squares: black white, white black, as if you had 4 pixels on the screen.
Then they started to double and double and double the squares. Before they got to 4K the screen was gray. Do you know what that means? There was no longer any difference between black and white, which is what allows you to see sharpness. It's the contrast that we see, not the actual information. Technically, the MTF (Modulation Transfer Function) was zero at 4K!
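A toy model shows how a pattern of black and white squares can collapse to gray. The sketch below is not a reconstruction of the IMAX test: it uses a simple moving-average blur as a crude stand-in for all the optical losses in the chain, and the pattern length and blur width are arbitrary assumptions.

```python
def square_wave(length, square):
    """Alternating runs of white (1.0) and black (0.0), each `square` samples wide."""
    return [1.0 if (i // square) % 2 == 0 else 0.0 for i in range(length)]


def box_blur(signal, width):
    """Circular moving average, standing in for the blur of the whole system."""
    n = len(signal)
    return [sum(signal[(i + k) % n] for k in range(width)) / width for i in range(n)]


def modulation(signal):
    """Contrast of the pattern: (max - min) / (max + min)."""
    hi, lo = max(signal), min(signal)
    return (hi - lo) / (hi + lo) if hi + lo else 0.0


for square in (16, 4):
    blurred = box_blur(square_wave(64, square), 8)
    print(square, modulation(blurred))
```

With the wide squares the blurred pattern still swings between full black and full white. Once a pair of squares fits inside the blur, every sample averages to the same mid-gray and the modulation is zero, however many squares are on the screen.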
Let's just pretend for a moment that IMAX truly is 4K. You watch IMAX at between one and one and a half picture heights from the screen. But in order to appreciate 4K on a regular movie screen, you would have to sit much closer than normal. In other words, most of the modern theaters with stadium seating are designed so that the middle of the theater is 2 ½ to 3 picture heights from the screen, and for most of us who watch movies, that's pretty much where we want to be sitting. Maybe just a little bit closer for some of us who do this for a living, because we're maybe looking for artifacts or issues. If you sit much closer than 2 ½ picture heights, that's what you're seeing: artifacts, not movies!
So if you had true 4K resolution in your local theater, everybody would have to be sitting in the first 6 rows. Otherwise they wouldn't see any extra detail. Their eyes wouldn't LET them see it. You know this intuitively from passing by these beautiful new monitors at trade shows. You find yourself getting absolutely as close as possible to see the detail, and to see if there are any visible artifacts. At normal viewing distances, you can't.
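The viewing-distance argument can be put into rough numbers. The sketch below assumes the common rule of thumb that the eye resolves about one arcminute of angle; that figure is brought in for the illustration and does not come from the interview.

```python
import math


def resolvable_lines(picture_heights, acuity_arcmin=1.0):
    """Approximate number of picture lines the eye can separate.

    `picture_heights` is the viewing distance divided by the picture height.
    `acuity_arcmin` is the assumed resolving power of the eye (rule of thumb).
    """
    angle_deg = 2 * math.degrees(math.atan(0.5 / picture_heights))
    return angle_deg * 60 / acuity_arcmin


for distance in (3.0, 2.5, 1.5, 1.0):
    print(distance, round(resolvable_lines(distance)))
```

Under that assumption, a viewer at 2 ½ to 3 picture heights can separate somewhere between roughly 1100 and 1400 lines, about what a 1080-line image already delivers. Making use of about 2000 lines means moving in to around 1 ½ picture heights, which is where Galt places the IMAX audience.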
So the whole 2K 4K thing is a little bit of a red herring.
Creative COW: What do you think about IMAX as a filmgoer?
John Galt: I don't like the frame rate. I saw Gorillas in the Mist and the gorillas were flying across the forest floor. Every frame they seemed to travel like 3 feet. [laughs]. It's really annoying. I mean I loved Showscan: 70mm running at 60 fps. In terms of a sense of reality, I think it was far superior to IMAX.
That's why I subscribe to Jim Cameron's argument, which is we would get much better image quality by doubling the frame rate than by adding more pixel resolution.
To many cinematographers, this is sacrilege. You often hear cinematographers saying there's something special about movies at 24 frames per second. This may be true, but I'll tell you one of the problems of 24 fps: it's the reason we watch such a dim picture on a movie screen. If you pump up the screen brightness, you notice the flicker from the 24 fps motion capture.
So when you are watching in a dark surround in a dark movie theater, the eye and brain get into this state called mesopic, which is neither photopic, which is full color vision in bright light, nor scotopic, which is night vision and no color. It's the in-between state between full color and no color vision. What happens there is that the brain takes longer to integrate an image, so it fuses the motion better and we are less sensitive to flicker, but we also lose color acuity.
But we have to remember that 24 frames was never designed from an imaging standpoint. It was designed for sound.
The original Kinetoscope ran at approximately 48 frames per second, with rapid pull down.
The Kinetoscope, first publicly demonstrated in 1891
A Scotsman by the name of William Kennedy Dickson (below, right), working for Thomas Alva Edison, figured out that you could shoot at 16 frames per second and show the same image 3 times with a three bladed shutter in the projector. And you save film - which had to appeal to a Scotsman like Dickson! (I'm a Scotsman by the way, which is why I can make jokes about a Scotsman.)
When sound came along and they couldn't get intelligible sound at 16, they went to the next submultiple of 48. And that's how we ended up with 24 frames per second with a 2-bladed shutter: 48 flashes per second from 24 frames per second.
Now if you take a still picture of somebody walking in front of you at a 48th of a second, you know that they're going to be blurred. But if we were to record 48 frames per second with a 2-bladed shutter, then the integration time would be only a 96th of a second, and each of the images would be sharper.
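The shutter arithmetic in the last few paragraphs can be checked directly. The sketch assumes the shutter is open for half of each frame period, which is what the 1/48 second figure at 24 fps implies; the function names are just labels for this example.

```python
from fractions import Fraction


def flashes_per_second(fps, blades):
    """Projector flashes per second when each frame is shown `blades` times."""
    return fps * blades


def exposure_time(fps, open_fraction=Fraction(1, 2)):
    """Seconds of exposure per frame, for a shutter open `open_fraction` of the time."""
    return Fraction(1, fps) * open_fraction


print(flashes_per_second(16, 3), flashes_per_second(24, 2))  # 48 48
print(exposure_time(24), exposure_time(48))                  # 1/48 1/96
```

Both projection schemes land on 48 flashes per second, and doubling the capture rate halves the time each frame has to smear motion.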
Recently we've been renting a camera from Vision Research called the Phantom, which easily shoots at 1000 frames per second. When you see a drop of water in a commercial fall slowly and create a lovely little splash of bubbles, that's the sort of thing shot by these high speed cameras.
Above, water balloon, after the balloon has burst, but before the water has fallen. Below, pouring liquid in a TV spot.
They are actually quite low-resolution, but because they're shooting at such a short shutter speed, they look much much sharper than cameras that have four times the resolution.
Vision Research chart on the Phantom HD digital cinema camera showing the effect of speed on resolution: 1000 frames at 2K, but to get to 2000fps, the maximum resolution is 640x480 - yet Phantom's pictures are inarguably gorgeous.
This is why I honestly think that in the future, one direction we're going to have to go is to higher frame rates, not more pixels.
Somebody said that the perfect thing would be 60 frames a second at IMAX. Kodak would love that. [laughs].
DYNAMIC RANGE & THE NONLINEAR TRANSFER FUNCTION
We think that the next improvement in digital imaging quality is being able to extend the scene dynamic range that you can capture.
We've been developing a new sensor technology called Dynamax. Now, I've been telling you that we don't need 4K -- well, this sensor is 37.5 megapixels! You basically have 6 green, 6 red, and 6 blue photosites for every pixel. Using the "NEW MATH" it is a 17K sensor!
Are you familiar at all with high dynamic range imaging in still photography? HDRI?
In the still photography world, what is going on is that people are taking multiple exposures and combining them. Let's say I do an exposure at a stop of 2.8. The next one is at 4, then 5.6, then 8, and 11. Depending on what I'm shooting, the 2.8 exposure could completely blow out the highlights, but it would have lots of shadow detail. And the f11 exposure would retain the highlights, but there would be no detail in the mid tones and the shadows. If we were to combine them, we'd have a single image with the most possible detail across the widest possible range.
So in Photoshop and some other programs, you can actually blend these images to create an ultra high dynamic range image. Just do a web search for high dynamic range imaging and you'll come up with what a lot of photographers have been doing. Some of the images are extraordinary, like paintings. They just have extraordinary information. Some of them are quite surrealistic.
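For readers who want to see the idea in miniature, here is a sketch of blending one pixel from a bracket of exposures. It is not how Photoshop or any particular program does it: the clipping thresholds, the simple averaging and the function names are all assumptions made for the illustration.

```python
def relative_exposure(f_number, reference=2.8):
    """Light reaching the sensor relative to the reference stop.

    Each full stop (2.8, 4, 5.6, 8, 11) passes roughly half the light of the last.
    """
    return (reference / f_number) ** 2


def capture(radiance, exposure):
    """Toy linear sensor: the recorded value clips at 1.0."""
    return min(1.0, radiance * exposure)


def merge_exposures(samples, floor=0.02, clip=0.98):
    """Estimate scene radiance from (recorded value, relative exposure) pairs."""
    usable = [(v, e) for v, e in samples if floor < v < clip]
    if not usable:
        # Every frame is clipped or nearly black: fall back to using them all.
        usable = samples
    return sum(v / e for v, e in usable) / len(usable)


stops = [2.8, 4, 5.6, 8, 11]
highlight = 3.0  # three times brighter than the f/2.8 frame can record
samples = [(capture(highlight, relative_exposure(n)), relative_exposure(n)) for n in stops]
print([round(v, 3) for v, _ in samples])
print(merge_exposures(samples))
```

The two widest stops blow out, as described above, but the three smaller stops each still hold the highlight, and dividing by their relative exposure puts them all back on one common scale.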
Creative COW: Six images, with different exposures, combined into one.
All CCD or CMOS sensors right now are linear devices, like a light meter. One photon goes in and one goes out...well, not exactly, because the quantum efficiency is not quite 100%. But let's say that if you have 10 units of light in, you get 10 units of charge, 20 units of light, 20 units of charge, and so on. It's linear. Film emulsions don't work that way. Film has a nonlinear transfer function.
First of all, there's a kind of inertia. It takes a certain amount of light to get any exposure at all, which is why the toe is flat. What it really says is that down at the bottom of the film characteristic curve we have less sensitivity.
Curve for Panalog color space, a 4:4:4 log color space.
And then we get on to the straight line part of the characteristic curve, and there, it truly is logarithmic. And then we get up and when it starts to roll off again, the so-called shoulder, what is happening is you've exposed all the silver grains of low light sensitivity that require lots of light, so the sensitivity drops again, and that happens to be a wonderful thing! Because if the scene dynamic range that you attempt to capture is a bright sunny day, you can easily have 1 million to 1 dynamic range.
But I'm going to be printing this on a piece of photographic paper, where the very best I'll get is 120:1. Or I'm going to watch it on a very high quality professional CRT monitor. The best it gets is 15,000:1. [Or an Optoma DLP at 10,000:1.] But we still started with a million to 1. How do I capture that?
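Contrast ratios are easier to compare once they are converted to photographic stops, where each stop is a doubling of light. The short sketch below does only that conversion on the ratios quoted above.

```python
import math


def stops(contrast_ratio):
    """Dynamic range in stops: the number of doublings in the ratio."""
    return math.log2(contrast_ratio)


for label, ratio in [("sunny scene", 1_000_000), ("CRT monitor", 15_000),
                     ("DLP projector", 10_000), ("photographic paper", 120)]:
    print(f"{label}: {stops(ratio):.1f} stops")
```

The scene spans about 20 stops, the best displays quoted here about 13 to 14, and paper about 7, so something in the chain has to compress the range.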
There have been devices made with logarithmic amplifiers and other things, but they're not terribly practical. So the idea is to be able to make a sensor that has a transfer characteristic which is more like a motion picture film emulsion.
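The difference between the two kinds of response can be sketched with a toy model. The S-shaped function below is a generic curve chosen because it has a toe, a near-straight middle section when plotted against log exposure, and a shoulder. It is not Panalog, not the Dynamax response, and not the curve of any real film stock, and the 0.18 mid point is an assumption.

```python
def linear_sensor(exposure, full_scale=1.0):
    """Linear response: output tracks light until it clips at full scale."""
    return min(exposure, full_scale) / full_scale


def film_like(exposure, mid=0.18, contrast=1.0):
    """Generic S-curve: flat toe, steeper middle, gentle shoulder. Output is 0..1."""
    return 1.0 / (1.0 + (mid / exposure) ** contrast)


for exposure in (0.01, 0.18, 1.0, 4.0, 8.0):
    print(exposure, linear_sensor(exposure), round(film_like(exposure), 3))
```

Above full scale the linear response returns the same value for every exposure, so highlights at 4 and 8 units are indistinguishable, while the S-curve still separates them, with less and less difference as it rolls off.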
In Dynamax we can control these individual photo sites so they have a short exposure, longer exposure and so on. So we can then take those exposures and blend them together to create a high dynamic range image, just as if you were shooting half a dozen different exposures.
The DYNAMAX-35 sensor is a multimode video sensor capable of operating up to 120 fps at 6x HDTV and 30fps at full res of 37Mpix
So, yes, the Dynamax sensor is by any measure a true 4K sensor. At least in the first pass, we have no intention of trying to record every one of the 37.5 million photosites as a greater number of pixels in Dynamax. It would be a waste of time. It's about getting more dynamic range.
In fact, everything talking about 4K belies the fact that most of the theater installations around the world are basically going at 2K. I mean the only commercial 4K digital cinema projector that I am aware of is the Sony 4K projector. But the bulk of theatrical installations around the world are the Texas Instruments DLP. And its maximum resolution is 2048x1080. I mean, let's face it. The difference between 1920 and 2048 is 6%. Believe me, you cannot see a 6% difference. Six percent is irrelevant.
To learn more about the practicalities of digital cinema, see the Creative COW Magazine Extra, "21st Century Cinema"
Besides, when we talk about scanning film to get 4K -- we don't, really. We typically scan perf to perf, but the actual Academy Aperture is 3456 pixels, which is quite a bit less than 4K. When you scan film at 2K, you're often just scanning 1728 across the Academy Aperture. Just to make things a little more complicated!
So these are all high definition television projectors going into theaters whether you like it or not. Slightly better color gamut, but they are all basically paying lip service to the idea that it's not HD.
It's a horrible ratio anyway, 2048/1920. You want to avoid these horrible little small ratio numbers because a digital filter to scale 1920 to 2048 is difficult, and you probably lose more in the filter than you can gain by having a few more pixels.
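A quick calculation shows why that ratio is awkward. The sketch counts how many different fractional positions the output pixels fall on relative to the source pixels, which is a rough proxy for how many distinct interpolation filters a scaler needs; the function name is made up for this example.

```python
from fractions import Fraction


def resample_phases(src, dst):
    """Distinct fractional offsets when mapping `src` pixels onto `dst` pixels."""
    return len({Fraction(k * src, dst) % 1 for k in range(dst)})


print(Fraction(2048, 1920))                   # 16/15
print(resample_phases(1920, 2048))            # 16
print(resample_phases(1920, 3840))            # 2
print(round(100 * (2048 - 1920) / 1920, 1))   # 6.7
```

A clean doubling needs only two filter positions, while 16/15 needs sixteen, and nearly every output pixel lands between source pixels and has to be interpolated. All of that buys an increase of about 6 to 7 percent in pixel count, depending on which width is taken as the base.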
One of the new developments for us is capturing scene metadata. We are doing a movie just now, I'm not sure if I'm allowed to talk about it, but anyway, this is a movie that is going to be released in 3D but is actually being shot in 2D, using our lenses that have built-in encoders to encode scene metadata. That metadata will be given to the people in post-production so that they can match the 3D computer generated imagery with the 2D live action photography.
And what capturing metadata means is that people in post can render a computer generated image that has all of the optical characteristics of the principal photography. That includes information such as a focus pull: they will have the information to pull focus in the computer generated part of the image too.
The cinematographer wants to know what the image looks like, not just on a TV monitor, but what it's going to look like on film. So we have this device called the Display Processor (below) where we can take 3D look up tables, load them into that device, feed the camera into that and then feed it into a wide, digital cinema color gamut monitor. That way you can emulate a particular negative printed on a particular print -- while you're shooting. This is one of the more popular pieces of hardware that we have developed. Most cinematographers go out with one of these per camera.
RENTALS, CUSTOMERS & THE DEMOCRATIZATION OF FILMMAKING
Creative COW: When people talk about an Arri D20 or a RED or whatever, one of the very first things to come up is the price of it. But that's really not a direct factor when we talk about rentals.
John Galt: One of the interesting things about Panavision's headquarters is that we have research and development here, we have the factory for manufacturing lenses and cameras right here, and we have the rental floor. This puts us directly in contact with customers. We know what they want, because they tell us. "No, I don't want higher resolution; I'd just have to sit closer to the screen. But yeah I'd like to have more shadow detail, I'd like to have more highlight detail. Can you do that?"
Another wonderful thing about the rental business is that the whole product development process is kind of turned upside down. When you sell something, service is a profit center. When you make something available for rent, service is a COST. Because we rent things instead of selling them, our best way to keep costs down is to build to higher standards.
Lenses are a great example. A zoom lens is built nominally, put together as per the spec. What they do next over in R&D is start making micro adjustments. They have a little eccentric cam that lets them measure deviations in the angle of rotation from where the cam is supposed to be. There are often over four hundred measurements made, going for the peak performance of that zoom lens at any particular focal distance.
That lens is then taken apart, the cam goes back into the factory, and they re-cut the cams based on the test results. Sometimes we'll do that 3 or 4 times. Why? Because in doing that, we can improve the performance of the lens by 30% or more. Is it expensive? Yeah, it's ridiculously expensive. But it's not expensive over the life of the lens.
And it's not expensive when you know that that lens will not be sitting on the shelf because a particular cinematographer doesn't like it. We have a whole floor set up at Panavision where customers test equipment every day. They will reject a particular lens not because its pictures aren't good, but because it doesn't FEEL right. That's why it's very very hard to build things for the rental market. There may be BUILDER remorse, but there is no buyer remorse. If they're not happy with something, back it goes onto OUR shelf, not theirs.
We can also develop new products in ways that aren't practical in a retail environment. So you know, the design goal for the Genesis camera was to be able to match the performance of our Millennium XL 35mm film camera, in all aspects: frame rate, size, weight, all of that. And we didn't get there. It was maybe 12 pounds, 13 pounds more than an XL with a 400 foot film magazine -- more than enough to matter to the poor bugger who has to carry it on a Steadicam all day. With a 400 foot film magazine, you're getting less than 5 minutes of recording time. That's fine for Steadicam, but we also wanted to record longer than that.
We just introduced the SSR-2, a dockable solid state recorder. We can record up to 84 minutes of uncompressed 1920x1080 at 4:2:2 or 42 minutes at 4:4:4. That requires almost three quarters of a terabyte of solid state flash memory. (We didn't consider hard drives because they just aren't reliable enough.)
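The storage requirement can be estimated with simple arithmetic. The interview does not give the bit depth or the frame rate, so the sketch below assumes 10 bits per sample and 24 frames per second; both are assumptions, and the result moves in proportion to either one.

```python
def storage_bytes(width, height, bits_per_pixel, fps, minutes):
    """Bytes of uncompressed video for the given format and running time."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps * minutes * 60


BITS_422 = 20  # assumed 10-bit: one luma sample plus, on average, one chroma sample
BITS_444 = 30  # assumed 10-bit: three full-resolution samples per pixel

print(storage_bytes(1920, 1080, BITS_422, 24, 84) / 1e9)  # 627.05664 (gigabytes)
print(storage_bytes(1920, 1080, BITS_444, 24, 42) / 1e9)  # 470.29248 (gigabytes)
```

Under these assumptions the 84-minute 4:2:2 case comes to a little over 600 gigabytes, in the same region as the figure quoted above. At 30 frames per second the same calculation gives closer to 780 gigabytes.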
Panavision Genesis with SSR-1 solid state recorder
When we started developing it three years ago, the flash memory cost alone to give you that recording time would have been $68,000! Of course, what happened during the few years of development is that the price of flash dropped to 1/10 of what it was when we started. Now, had we been building this to sell, we'd never have built it at all. It would have been totally impractical to even consider.
But if you're going to rent it out, you can look at the longer term. That expensive piece of flash memory saved us money because we never need to service it, or replace it for wear and tear. You only have so many read and write cycles, but one of our flash manufacturers calculated that we have at least 35 years! The technology will be long obsolete before then.
Creative COW: I think one of the aspects of democratization that people have missed is that products can become more and more complex and more and more sophisticated and are available to rent. Which is quite democratic indeed. I've probably got enough money to rent this thing for a few months relative to buying it.
John Galt: I think it was Orson Welles who said that movie making is the only art form where the artist can't afford the material for his art. If you've got the movie that you've got to make, you can go out there and collect money, and beg, borrow and steal, and do whatever is necessary to go out and buy yourself a camera that costs less than 20 grand. Before it's useful, it's closer to 100 grand.
Or you can spend that money on your production. I think that getting your movie made and seen is probably more important than accumulating equipment.
The studios realized this a long time ago, which is why none of the studios have their own camera department anymore. It was a department like any other - props, costumes, sets, and so on.
All those studios also used to have their own service shop, they had their own cameras, their own lenses, their own tripods, their own lights -- everything.
And what the studios realized is that they didn't make enough movies to justify maintenance of that camera department.
So ultimately the camera departments disappeared, and you found that you had companies that serviced all the studios like Panavision, that were providing cameras, lenses and accessories.
Now if we don't make these lenses as well as possible, they'll sit on the shelves and gather dust. And that's true for every piece of equipment we build. There's not a lot of selling we can do. We bring people in, we show it to them, we explain what it does, they try it, if they like it they'll use it, and if they don't it sits on the shelves.
Whereas, I'm an amateur wood worker. You wouldn't believe the number of tools that I have bought over the years that I thought were just going to be wonderful for one thing or another, that just sit on the shelf to embarrass the hell out of me when I think of the money I spent for them. I just bought into some piece of advertisement or promotion and it seemed like a great idea at the time. Well, our customers don't have that problem. They'll take something out, and if it doesn't work for them, it comes back, and that's it.
In the end, it's a win-win. We put a bit more into the design, manufacture and assembly process, and we get fewer equipment rejects, and fewer service problems over time. The rental environment allows you to make a better product available to the customer.
At around this point in the conversation - no kidding -- we heard the intercom announce that John was needed on the rental floor! We truly appreciate how generously he shared his time with us.
John Galt is currently the Senior Vice President of Advanced Digital Imaging at Panavision's corporate office. His responsibilities at Panavision include the development of digital imaging technologies in support of Panavision's core motion picture and television production business.
Galt was project leader of the group that, with Panavision's technology partner Sony, developed the "Genesis" digital cinematography camera. Prior to Genesis, Galt was also responsible for the "Panavised" version of the Sony HDW-F900 first used by George Lucas on Star Wars: Episode II.
He was previously employed as Vice President, High Definition Technology Development for Sony Pictures High Definition Center. His main responsibilities were the integration of electronic and film imaging systems. This included film preservation, High Definition film transfer systems and electronic cinema. Galt was project leader of the group that designed and built the first High Definition Telecine in North America.
Prior to joining Sony in 1988, Galt was president of Northernlight & Picture Corporation in Toronto, Canada. Northernlight & Picture was the co-producer, along with the Canadian Broadcasting Corporation, of "Chasing Rainbows," a fourteen-hour drama series, the first to be produced and photographed using high definition video technology. John Galt was also Director of Photography on this project.
He holds numerous U.S., British, and Japanese patents in film and electronic imaging related areas.