Probably the earliest example of the use of this effect was the design of the earliest colour television systems, which date all the way back to 1938 and the experiments of Georges Valensi. Valensi's system sent three signals, as any full-colour system needs to: a conventional black-and-white picture plus a combined set of two colour-difference signals, representing the luminance signal subtracted from the blue signal and the luminance signal subtracted from the red signal. Most of us will be aware that this is broadly equivalent to the mid-century colour systems of PAL and NTSC, as well as the approaches to subsampled YUV or YCbCr video that persist to this day.
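The idea can be sketched in a few lines of code. This is a hypothetical illustration using the modern BT.601 luma weights; Valensi's actual system weighted things differently, but the principle — send luminance plus two differences, and recover green from them — is the same.

```python
# Sketch of luminance-plus-colour-difference encoding, assuming
# BT.601 luma weights (not Valensi's exact values).

def encode(r, g, b):
    """Turn an RGB triple (0..1) into luminance plus two differences."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # the black-and-white signal
    return y, b - y, r - y                 # Y, B-Y, R-Y

def decode(y, b_y, r_y):
    """Recover RGB; green falls out of Y once B and R are known."""
    b = y + b_y
    r = y + r_y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

Note that no green signal is ever transmitted: green is reconstructed at the receiver from the high-bandwidth Y signal and the two differences, which is exactly why Y carries most of the green information.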
While Valensi's system wasn't widely adopted (it used lots of extra radio bandwidth, which nobody liked, even before the idea of mass advertising), it had the advantage of backwards compatibility with black-and-white TVs. That's a trick future systems would be designed to duplicate, to avoid making everyone buy a new TV. What's crucial about it, though, is that it effectively used the highest-bandwidth, black-and-white signal to encode the green information, so it's a trick that's been current since before World War 2. Valensi, we might suspect, knew that this was the best approach. The question is why it's the best approach.
The human eye
The answer lies in the makeup of the human retina, which broadly contains two types of cells, rods and cones, which (very broadly) equate to pixels on an imaging sensor. Rod cells are highly sensitive, with a peak of sensitivity at around 500nm, which is roughly where blue shades into green. There are around 100 million of them in the human eye (some sources say 120 million), which is far more than the six (or so) million colour-sensitive cone cells. It's therefore tempting to leap to the conclusion that we see green sharply because there are more rods than cones and rods see green best. 100 megapixels! Holy high resolution, Batman!
That doesn't quite work out, though, because the moment we actually look at anything, we move our eyes so that the region of interest falls on the central area of the retina. Most of the colour-sensitive cone cells are in this area; there aren't so many rods. This part of the retina is good for seeing sharp detail because each of the nerve fibres that take signals from cone cells to the brain is (roughly speaking) connected to only one cone cell. By comparison, quite a lot of rod cells are connected to each nerve fibre, which is good for sensitivity, because we're adding up the signal from a lot of (kinda) pixels, but not so great for sharpness.
So that's why the central area of vision is sharper. What we've heard so far, though, suggests that our daytime colour vision should be sharper than our (in effect) monochrome night vision. It is, but it gives us no reason to assume that we should see green more sharply than red or blue. The reason for that lies in the sensitivity curves of the three types of cone. It's often said that we have red, green and blue-sensitive cone cells, which is sort of true, but much as with a Bayer-pattern electronic image sensor, there's a lot of overlap between the three types, to the point that the medical world calls them long wavelength (reddish), middle wavelength (greenish) and short wavelength (bluish), because they really see a lot more than a single colour.
To see saturated colour, the brain does more or less the same sort of processing that has to happen in a Bayer-sensor camera in order to recover full-colour information. The reason this gives us our best acuity in green is simply that there's a lot of overlap between the medium and long-wavelength (green and red) cones. This happens to the point where the medium-wavelength cones can see everything from a greenish turquoise all the way through to, well, a fairly orange yellow, while the long-wavelength (red) cones can see from mid-green to the borders of infra-red. The result is an overall peak of sensitivity at a place which really looks pretty green, despite the fact that we can also see red using the same anatomy.
So, the highest-density, sharpest part of the retina is most sensitive to greenish light. It's because of this that we use a preponderance of green to form the Y channel of a YCbCr image, which is done by everything from JPEG to ProRes. It was because of all this that the early colour television systems carried most of the green channel information in their highest-quality, black-and-white, backward-compatible picture. It's also because of this that Bayer-pattern imaging sensors use twice as many green pixels as they use red and blue pixels. Bayer himself described the green elements in his design as “luminance-sensitive” in the first sentence of the patent, and goes on to say “said luminance-type elements are sensitive in the green region of the spectrum.”
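The two-to-one green ratio is easy to see if you tile the standard RGGB cell across a sensor and count the filters; this is a minimal sketch, not anything resembling a real sensor layout pipeline.

```python
# A minimal sketch of a Bayer colour filter array: tile an RGGB 2x2
# cell across a sensor and count the filters of each colour.

def bayer_pattern(width, height):
    cell = [["R", "G"],
            ["G", "B"]]  # one RGGB cell
    return [[cell[y % 2][x % 2] for x in range(width)]
            for y in range(height)]

pattern = bayer_pattern(8, 8)
counts = {c: sum(row.count(c) for row in pattern) for c in "RGB"}
# counts -> {'R': 16, 'G': 32, 'B': 16}
```

Exactly half the photosites come out green, twice the count of red or blue, which is what lets the green-heavy sites stand in for Bayer's "luminance-sensitive" elements.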
Two interesting bits of trivia arise from all this.
Firstly, the human retina is, in effect, a front-side illuminated sensor. The most light-sensitive ends of the rod and cone cells are at the back, where it's harder to get the light to them. This does compromise sensitivity somewhat, but it also means they can be most easily supplied with the fresh chemistry they need to do their work.
Secondly, Georges Valensi was involved with a committee with an interest in long-distance telephony, the CCIF, the Comité consultatif international des communications téléphoniques à grande distance. This, via mergers, became the International Telegraph and Telephone Consultative Committee (CCITT, from its French name), then the International Telecommunication Union. From the ITU, we get crucial standards such as Recommendation BT.709, which describes how to create subsampled video where the green channel is mainly stored in the full-resolution Y plane.
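The subsampling itself can be sketched simply: keep the green-heavy Y plane at full resolution, and average each chroma plane over 2x2 blocks, as in 4:2:0 video. This is an illustrative sketch (simple box averaging, even dimensions assumed), not how any particular codec does it.

```python
# Sketch of 4:2:0-style chroma subsampling: the Y plane is left
# untouched at full resolution, while each chroma plane (Cb or Cr)
# is averaged over 2x2 blocks, quartering its pixel count.

def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

subsample_420([[1, 3],
               [5, 7]])  # -> [[4.0]]
```

Because green dominates Y, the detail we see most sharply survives at full resolution, while the red and blue differences are the ones that get thrown away.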
Image courtesy of Shutterstock.