Problem with Gamma Encoded Images in Digital Cinematography and a Possible Solution
Other than simple perversity of design, the reason that gamma-encoded images are bad for digital cinematography is that in order to create a viewable image without the need for manual colour grading – which is impractical in applications such as multi-camera studio production and electronic news gathering – certain assumptions must be made about how the image should look. This is particularly true with regard to the brightest areas of the image, which are compressed the most by gamma encoding as the gamma curve approaches the horizontal toward the top of the graph. Where a production does want to grade, options are limited by gamma encoding, especially since the highlight region is where electronic cameras traditionally compare less well to the photochemical film stocks they're replacing.
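To put rough numbers on how the curve flattens toward the top, the short Python sketch below evaluates the slope of a gamma curve at a few light levels. It assumes a pure power-law encode with an exponent of 1/2.2, which is an illustrative simplification rather than the transfer curve of any particular camera; the names GAMMA, gamma_encode and encode_slope are invented for this example.

```python
# Assumption: a pure power-law encode with exponent 1/2.2. Real video
# transfer curves (linear toe segments, knee and so on) differ.
GAMMA = 2.2


def gamma_encode(linear):
    """Map scene-linear light in [0, 1] to an encoded signal in [0, 1]."""
    return linear ** (1.0 / GAMMA)


def encode_slope(linear):
    """Analytic slope of the encode curve at a given linear light level."""
    return (1.0 / GAMMA) * linear ** (1.0 / GAMMA - 1.0)


if __name__ == "__main__":
    for level in (0.01, 0.18, 0.5, 1.0):
        print(
            f"linear {level:4.2f} -> encoded {gamma_encode(level):.3f}, "
            f"slope {encode_slope(level):.2f}"
        )
```

The slope falls steadily as the light level rises, so a given change in scene light near the top of the range moves the encoded signal far less than the same change in the shadows.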
Solving this problem is simply a matter of choosing not to gamma-encode the data coming from the imaging sensor, although storing linear-light information has its own concerns. First, and most critically, modern CMOS sensors, which are used in most current and upcoming digital cinematography cameras, don't actually produce linear-light data. We must be careful here as real-world CMOS sensors may include a significant amount of image processing electronics within the device which might cause the data to appear linear. However, the actual light-sensitive elements of most CMOS sensors, which are photodiodes operated in the reverse-biased mode, have a response to light in which the signal increases in proportion to the square root of the amount of light that hits them (there are almost as many varieties of CMOS sensor as there are CMOS sensors, so these notes apply only to the most basic expression of the technology). Therefore, any CMOS camera claiming to be storing “raw” sensor information in anything approaching linear light must be processing its information to at least some degree. We must be cautious when considering what the performance of a sensor is, as opposed to the changes made to that performance by associated electronics, even if that processing is incorporated in the same physical piece of silicon as the sensor photodiodes.
As we saw above, performing a gamma encoding prior to storing the image, as opposed to doing so on recovery, can make image storage practical in an 8-bit image. But as we also learned, it can make grading more difficult. There are two solutions: either use more bits, so that there are enough luminance levels available to overcome the problems associated with the brute-force approach of storing a linear light image, or modify the image somehow such that it makes better use of the available digital luminance levels, without affecting grading.
If we store an image with at least 12 bits of precision, providing 2^12 = 4096 luminance levels, it may become practical to work with a linear-light image. This is why the internal design of television cameras is often 14 or 16-bit, to provide enough precision to apply gamma to the (often nearly linear, for CCDs) sensor data directly, avoiding quantization noise and other precision problems associated with rounding off the results of digital mathematics. Eventually, in postproduction, or in monitors on set, it will be necessary to apply some form of gamma-like modification to the data to view the image, but advances in data storage techniques such as flash mean that storing 12, 14 or 16-bit linear-light images is more practical than ever. The advantages of doing so are in simpler, potentially slightly less power-hungry cameras and the ability to do certain types of mathematics – such as colour balancing – without having to do any preprocessing.
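The effect of bit depth on a linear-light image can be seen by counting how many code values fall within each stop. The Python sketch below does this for an idealised integer linear encoding in which each stop down halves the code range; the function name linear_codes_per_stop is invented for this illustration and does not describe any real camera's format.

```python
def linear_codes_per_stop(bits, stops):
    """Count the integer code values in each stop, brightest stop first.

    Assumes an idealised linear encoding in which the brightest stop
    occupies the top half of the code range, the next stop the quarter
    below that, and so on.
    """
    top = 2 ** bits  # one past the largest code value
    counts = []
    for _ in range(stops):
        bottom = top // 2
        counts.append(top - bottom)
        top = bottom
    return counts


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(bits, "bits:", linear_codes_per_stop(bits, 8))
```

Under these assumptions the eighth stop down has a single code value at 8 bits but sixteen at 12 bits, which is the sense in which extra bits make the brute-force linear approach workable.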
An Alternative Solution
The other solution to the problem of storing linear-light images is to apply some form of amplification to the darker values, representing shadow detail in the scene, such that they are represented by a reasonable number of digital luminance levels, but to do this without causing the problems of gamma correction. In this situation, “reasonable” might mean that a change in image brightness as perceived by the human visual system is represented by an equal change in the numeric luminance value, regardless of the absolute light level in the scene. To put it another, perhaps simpler way, we want the bottommost stop of the image to be represented by the same number of digital counts as the uppermost stop.
This ideal can be closely approximated using a logarithmic curve, matching the behaviour of F-stops wherein each doubling of light appears as a consistent increase to the eye. The use of logarithmic encoding for images was originally developed by Kodak for its Cineon film scanning system, and some camera systems provide options to record images suitable for use in postproduction procedures such as colour grading systems which expect Cineon data.
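As a rough illustration of the equal-counts-per-stop idea, the Python sketch below encodes linear light with a plain base-2 logarithm. It is emphatically not the Cineon curve or any manufacturer's log format; the 10-bit depth, the ten-stop range and the name log_encode are all assumptions chosen only to keep the arithmetic easy to follow.

```python
import math


def log_encode(linear, bits=10, stops=10):
    """Encode linear light (1.0 = brightest) to an integer code value.

    Assumption: a pure log2 curve spanning `stops` stops below 1.0.
    Anything darker than the bottom of that range is clipped to code 0.
    """
    max_code = 2 ** bits - 1
    floor = 2.0 ** -stops  # darkest representable light level
    linear = min(max(linear, floor), 1.0)
    stops_above_floor = math.log2(linear / floor)
    return round(max_code * stops_above_floor / stops)


if __name__ == "__main__":
    previous = None
    for stop in range(0, 11):
        code = log_encode(2.0 ** -stop)
        step = "" if previous is None else f" (step {previous - code})"
        print(f"{stop:2d} stops down -> code {code}{step}")
        previous = code
```

Each stop, whether at the top or the bottom of the range, receives roughly the same number of code values, in contrast to a linear encoding where each stop down receives half as many as the one above.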
There are a number of potential stumbling blocks for users of all these techniques. First, the terminology is confusing: when the industry began using logarithmic encoding, universally referred to as “log”, it was natural to begin referring to non-log images as “linear.” But of course, most video images are gamma-encoded, often for display on conventional video hardware, and are nothing like linear with respect to the absolute amount of light in the original scene. It is for this reason that we use the special term “linear light” to refer to actually-linear images.
The final great confusion of all this is that real cameras very rarely produce truly linear, truly logarithmic, or truly linear-light data. The camera manufacturer does not generally have access to actual linear data to begin with, so anything described as “linear light” is more properly described as “data processed to approximate linear light.” This being the case, it is almost always necessary for camera manufacturers to perform processing on the image data, and in doing so they are often tempted to make changes which they feel make for better performance. This is very much the case in gamma-encoded conventional video cameras, which usually provide user-accessible features such as auto knee to allow for various different highlight characteristics. It is also the case in log-encoding devices, which is why postproduction people must concern themselves not only with what system is in use – log, linear, linear light – but also with what kind of log is in use. Logarithmic images are very rarely based on a simple logarithm of the linearised image data.
Given all this, one might yearn for a return to the days of television, video and CRTs, or perhaps to the world where film looked the way it looked based on how it was manufactured and processed. It's also reasonable to expect all this variability to settle down into standardisation, much as the various gauges of film did in the latter part of the 19th century and the first few decades of the 20th. Even so, this all represents a rather complex situation with the potential to cause people serious problems with images that just don't look right, and for the time being at least, a cautious and methodical approach remains necessary.
I'm indebted to David Gilblom of Alternative Vision Corporation for his notes on CCD and CMOS sensor performance which I used in the preparation of this article.