Replay: Phil Rhodes explains why luminance is so important within photography
The way in which electronic cameras encode luminance values has been an issue for engineers ever since such cameras first appeared. If you wanted a television system – that is, to send moving pictures by radio – the most obvious solution would be to take an electronic device that is sensitive to light and send its output (or the output of many) as a variation in the strength of a radio signal. Or, to store it as the strength of a magnetic field, on tape, if that was your goal.
The problem with such simplistic approaches is that things are very rarely linear – the output of your light-sensitive piece of electronics might not precisely double if you double the amount of light falling on it, and the light output of your reproduction device might not always conveniently double if you double the signal level. Although some light-sensitive devices actually do behave almost as straightforwardly as that (CCDs, for one), what's certainly true is that the human eye isn't anywhere near linear: if we double the amount of light coming out of something, it doesn't look anything like twice as bright. If we take a series of grey blocks that successively double in luminance, they'll actually appear to humans as a nice continuous ramp, with the same difference in brightness between any adjacent pair of blocks. It is for this reason that photographic stops each represent a doubling of luminance, but look like a series of consistent increases. We perceive the difference between, say, F/4 and F/2.8 as the same increase in brightness as the difference between F/8 and F/5.6. In terms of absolute light intensity, though, the difference between 4 and 2.8 represents a considerably larger increase than the difference between 8 and 5.6.
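The arithmetic behind that F-stop comparison can be sketched in a few lines of Python. This assumes the standard photographic relationship (not spelled out in the text above) that the light admitted is proportional to 1/N², where N is the F-number; the function name is purely illustrative. Marked F-numbers such as 2.8 and 5.6 are rounded values, so the ratios come out close to 2 rather than exactly 2.

```python
def relative_light(f_number):
    """Relative light admitted at a given F-number (assumes light is proportional to 1 / N**2)."""
    return 1.0 / f_number ** 2

# One-stop steps at two different points on the aperture scale,
# stored as (light after opening up, light before).
wide_step = (relative_light(2.8), relative_light(4.0))    # F/4 -> F/2.8
narrow_step = (relative_light(5.6), relative_light(8.0))  # F/8 -> F/5.6

for label, (after, before) in (("F/4 -> F/2.8", wide_step),
                               ("F/8 -> F/5.6", narrow_step)):
    ratio = after / before      # roughly 2 for both steps
    increase = after - before   # absolute gain in light
    print(f"{label}: ratio {ratio:.2f}, absolute increase {increase:.4f}")
```

Both steps roughly double the light, yet the absolute increase from F/4 to F/2.8 is about four times the increase from F/8 to F/5.6.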
This causes a problem with precision: we may not be devoting enough of our signal – which often means not enough individual digital numbers – to represent certain differences in brightness. Assuming that we are storing an image in an 8-bit file, all of the theoretically infinite variations in luminance in a scene must be stored as one of 2^8 = 256 levels. If the image data linearly represents the absolute amount of light in the scene, a situation we call Linear Light, an excessive number of these levels would be used to encode what, to us, looks like a very small increase in brightness at the top of the scale. In photographic terms, considering that each F-stop represents a doubling of the absolute amount of light, half of the available values (the top 128) would be used to encode the brightest single F-stop's worth of information. On an everyday camera capable of perhaps 12 stops of dynamic range, the other 11 stops, containing the vast majority of the image, would be encoded using the other, lower 128 values. Were we to store a linear-light image this way, the darkest parts of the image would suffer from awful banding (properly, quantization noise) as the small number of digital values used to store shadow detail would have to indicate widely-spaced differences in brightness. A similar problem exists – or existed – in the analogue world, where very small signals could be subject to excessive noise and would hence degrade the image in shadowy areas.
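To make the imbalance concrete, here is a minimal sketch that counts how many 8-bit codes land in each stop of a linear-light encoding. It assumes that code 255 represents the brightest level the camera can record, that each stop spans a halving of light below that, and that code 0 (pure black) is left out of the count; the function name and the stop boundaries are illustrative choices, not part of any standard.

```python
def linear_codes_per_stop(bits=8, stops=12):
    """Count the non-zero code values that fall in each stop of a linear encoding.

    Stop 1 is the brightest. Stop k covers relative light levels in the
    half-open range (2**-k, 2**-(k-1)], with full scale equal to 1.0.
    """
    max_code = 2 ** bits - 1
    counts = []
    for k in range(1, stops + 1):
        # Integer arithmetic avoids floating-point edge cases:
        # code / max_code > 2**-k        <=>  code * 2**k > max_code
        # code / max_code <= 2**-(k-1)   <=>  code * 2**(k-1) <= max_code
        in_stop = sum(
            1
            for code in range(1, max_code + 1)
            if code * 2 ** k > max_code and code * 2 ** (k - 1) <= max_code
        )
        counts.append(in_stop)
    return counts

for stop, count in enumerate(linear_codes_per_stop(), start=1):
    print(f"stop {stop:2d} (from the top): {count} codes")
```

Under these assumptions the brightest stop takes 128 codes and each stop below it takes half as many, so the codes run out entirely before the darkest stops are reached.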
Straightforward Solution: Gamma Encoded Images
The most straightforward solution would be to boost the level of the darkest areas of the scene before transmitting them, while leaving the brightest areas alone. The technical implementation of this is referred to as gamma correction, because the mathematics used is a power law, where the input is raised to a given power to create the output, and the exponent is represented by the Greek letter gamma. The result is that a graph representing the light level against the signal level is quite pronouncedly curved, very approximately like applying the Curves filter in Photoshop, grabbing the middle of the curve, and dragging it upward. This is also why uncorrected linear-light images look very dark and gloomy when displayed on conventional monitors that expect gamma-corrected signals.
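The power law can be sketched as follows. The gamma value of 2.2 is an assumption chosen for illustration, since the text above gives no figure, and real broadcast and computer standards use more elaborate curves with a short linear section near black. The function names are hypothetical. The sketch also counts how the 8-bit codes are shared out across 12 stops once the signal is gamma encoded, using the same illustrative stop boundaries as a linear count would.

```python
GAMMA = 2.2  # assumed exponent, for illustration only

def gamma_encode(light, gamma=GAMMA):
    """Map linear light (0.0 to 1.0) to a signal level with a pure power law."""
    return light ** (1.0 / gamma)

def gamma_decode(signal, gamma=GAMMA):
    """Invert gamma_encode: map a signal level back to linear light."""
    return signal ** gamma

def gamma_codes_per_stop(bits=8, stops=12, gamma=GAMMA):
    """Count the non-zero codes whose decoded light falls in each stop.

    Stop 1 is the brightest; stop k covers light levels in (2**-k, 2**-(k-1)].
    """
    max_code = 2 ** bits - 1
    counts = [0] * stops
    for code in range(1, max_code + 1):
        light = gamma_decode(code / max_code, gamma)
        for k in range(1, stops + 1):
            if 2.0 ** -k < light <= 2.0 ** -(k - 1):
                counts[k - 1] += 1
                break
    return counts

print(f"18% grey encodes to a signal of {gamma_encode(0.18):.2f}")
for stop, count in enumerate(gamma_codes_per_stop(), start=1):
    print(f"stop {stop:2d} (from the top): {count} codes")
```

With these assumed values the brightest stop receives far fewer than 128 codes, and every one of the 12 stops receives at least a few, which is the redistribution toward the shadows that the boost is intended to achieve.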
Monitor technology complicates this situation. Traditional electronic displays – CRTs, mainly – did not have anything like a linear response, which simply means that doubling the signal input didn't make the tube output twice as many photons. Through a bit of clever engineering and a lot of blind luck, it turns out that the power-law gamma encoding used to make linear-light images workable in analogue television broadcasting is counteracted, with surprising accuracy, by the nonlinearity of a CRT monitor. The remaining difference is trimmed out electronically, and the resulting image has appropriate luminance.
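The way the two nonlinearities nearly cancel can be sketched with two power laws in series. The exponents below (0.45 for the encoding and 2.5 for the CRT) are illustrative assumptions rather than figures from the text, and the "trim" is modelled as one extra power-law stage, which is a simplification of whatever a real signal chain does.

```python
def end_to_end(light, encode_exp=0.45, display_exp=2.5, trim=True):
    """Linear scene light (0.0 to 1.0) through encoding, optional trim, and display.

    encode_exp and display_exp are assumed, illustrative exponents.
    """
    signal = light ** encode_exp
    if trim:
        # The two stages leave a residual exponent of encode_exp * display_exp;
        # applying its reciprocal to the signal removes what is left over.
        residual = encode_exp * display_exp
        signal = signal ** (1.0 / residual)
    return signal ** display_exp

scene = 0.18
print(f"residual exponent without trim: {0.45 * 2.5:.3f}")
print(f"displayed light without trim:   {end_to_end(scene, trim=False):.4f}")
print(f"displayed light with trim:      {end_to_end(scene, trim=True):.4f}")
```

Without the trim, the combined exponent is close to, but not exactly, 1, so mid-tones come out slightly darker than the scene; with it, the displayed light matches the input.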
I use the word “appropriate” here because anyone who's ever looked at a monitor on a camera, then glanced at the actual scene, is well aware that the brightness and colour of the two do not often, if ever, look particularly similar. Until the advent of serious digital cinematography, we relied upon the cooperative design of cameras and displays to provide results that were nevertheless watchable, and to ensure that no matter what we shot, it would look, if not precisely like the original scene, at least viewable and without egregious errors.
Given this situation, it's ironic that the zenith of sensor technology during the standard-definition TV era was the charge-coupled device, or CCD, which actually has an output that's quite close to linear. Double the exposure of the image falling on a CCD, and, until it clips at maximum output, its output will very nearly double in intensity; it is a linear-light device. Now, this is not to say that the actual output of a real-world CCD-based camera has these characteristics, because the camera must internally gamma-encode the images coming from its sensor so that they're compatible with the rest of the world's equipment. Of course, standard displays are generally not CRTs any more – they're more likely to be TFT liquid-crystal displays, which have their own non-linearity, but with electronics to alter the signal supplied to them until they appear to have the same performance as those old-style cathode ray tubes. So, until relatively recently, and still in broadcast television, we have had TFT monitors pretending to be cathode ray tubes, so that cameras which are actually quite linear, but designed to drive long-obsolete types of display, produce pictures that look right. It is perhaps not surprising that the world of cinematography prefers to avoid these machinations.