Emil Martinec
Noise in digital signal recording places an upper bound on how finely one may usefully digitize a noisy analog signal. One example of this is 12-bit vs 14-bit tonal depth -- current DSLRs with 14-bit capability have noise of more than four raw levels, so the last two bits of the digital encoding are just random noise; the image could have been recorded at 12-bit tonal depth without any loss of image quality.
The presence of noise masks tonal transitions -- one can't detect subtle changes of tonality of plus or minus one raw level when the recorded signal plus noise is randomly jumping around by plus or minus four levels. A smooth gradient can't be smoother than the random background.
NEF "lossy" compression appears to use this fact to our advantage. A uniformly illuminated patch of sensor will have a photon count which is roughly the same for each pixel. There are inherent fluctuations in the photon counts of the pixels, however, that are characteristically of order the square root of the number of photons. That is, if the average photon count is 10000, there will be fluctuations from pixel to pixel of as much as sqrt[10000]=100 photons in the sample. Suppose each increase by one in the raw level corresponds to counting ten more photons; then noise for this signal is 100/10=10 raw levels. The linear encoding of the raw signal wastes most of the raw levels.
In shadows, it's a different story. Suppose our average signal is 100 photons; then the photon fluctuations are sqrt[100]=10 photons, which translates to +/- one raw level. At low signal level, none of the raw levels are "wasted" in digitizing the noise.
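To make the arithmetic above concrete, here is a small Python sketch of the two cases (the gain of 10 photons per raw level is the same illustrative figure as in the example, not a measured camera value):

```python
import math

GAIN = 10.0  # illustrative: photons counted per raw level, as in the example above

def noise_in_raw_levels(mean_photons: float, gain: float = GAIN) -> float:
    """Photon (shot) noise, expressed in raw levels, for a given mean photon count."""
    noise_photons = math.sqrt(mean_photons)  # Poisson fluctuation ~ sqrt(N)
    return noise_photons / gain              # convert from photons to raw levels

print(noise_in_raw_levels(10000))  # highlight: 10.0 raw levels of noise
print(noise_in_raw_levels(100))    # shadow: 1.0 raw level of noise
```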
Ideally, one would want an algorithm that thins the levels (widening the level spacing) at high signal while keeping them intact at low signal, all the while keeping the level spacing below the noise for any given signal. NEF "lossy" compression uses a lookup table to do just that, mapping raw levels 0-4095 (for 12-bit raw) into compressed values in such a way that there is no compression in the shadows, but increasing thinning of levels in the highlights, following the square-root relation between photon noise and signal. Here is a plot of the lookup table values (this particular table has 683 compressed levels; the compression varies from camera to camera depending on the relation between raw levels and photon counts):
The horizontal axis is the compressed value in the "lossy" NEF; the vertical axis is the raw level. The blue curve is the compression lookup table: a given compressed value on the horizontal axis corresponds to the 12-bit raw level plotted. "In-between" raw levels are rounded to the nearest one in the lookup table. The plot is linear at the low end because photon fluctuations at low signal aren't big enough to permit thinning of raw levels; it then bends upward starting at about compressed level 285, and the curve steepens as more and more levels are thinned out.
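For illustration, here is a minimal Python sketch of a decode table with that shape -- 1:1 up to about level 285, then a square-law rise to 4095 over 683 entries, with the constants pinned down by continuity at the cutoff and full scale at the top entry. That construction is my own assumption for the sketch, not Nikon's actual table:

```python
import math

RAW_MAX = 4095     # top of the 12-bit raw range
NUM_ENTRIES = 683  # number of compressed code values (from the plot)
CUTOFF = 285       # approximate end of the 1:1 linear region (from the plot)

def build_decode_table() -> list[int]:
    """Compressed code -> raw level: 1:1 in the shadows, square-law above.

    A and B are chosen so that raw = A*(c - B)^2 meets the linear part at
    CUTOFF and reaches RAW_MAX at the last code.  Illustrative only.
    """
    c_max = NUM_ENTRIES - 1
    r = math.sqrt(RAW_MAX / CUTOFF)      # ratio of square roots across the curved part
    B = (c_max - r * CUTOFF) / (1 - r)   # solves the two boundary conditions
    A = CUTOFF / (CUTOFF - B) ** 2
    table = []
    for c in range(NUM_ENTRIES):
        raw = c if c <= CUTOFF else A * (c - B) ** 2
        table.append(min(round(raw), RAW_MAX))
    return table

table = build_decode_table()
print(table[100], table[285], table[682])  # 100 285 4095
```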
The red curve is a fit to the photon noise model of how much compression one should be able to get away with. It is the best fit of the data to
raw level = A x (NEF compressed level - B)^2
where the constant A is determined by the sensor "gain" (its efficiency in capturing photons), and the constant B is an offset to account for the linear part of the curve where no compression is being done. The model is very nearly a perfect match to the lookup table data.
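Inverting that relation gives the direction the camera would apply when writing the file (valid above the linear, uncompressed region). A rough sketch, with constants of roughly the right size rather than Nikon's actual fit values:

```python
import math

A, B = 0.0141, 143.0  # illustrative constants, roughly the shape of the fit described above

def encode(raw_level: float) -> int:
    """12-bit raw level -> compressed code, inverting raw = A*(code - B)^2."""
    return round(B + math.sqrt(raw_level / A))

def decode(code: int) -> float:
    """Compressed code -> reconstructed raw level."""
    return A * (code - B) ** 2

raw = 3000
code = encode(raw)
print(code, round(decode(code)))  # round-trip error of a few levels, below the noise there
```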
The agreement strongly indicates that Nikon's engineers are using the properties of light in a clever way: thinning the number of levels used to record the raw data to only as many as are needed to prevent posterization, and letting the noise inherent in the light signal dither the tonal transitions across the increasingly large gaps at higher luminance. The gaps will be undetectable because of the noise, and the retained levels encode the image data with maximum efficiency.
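That claim is easy to check numerically with the same illustrative constants: compare the gap between adjacent retained levels against the photon noise at that signal.

```python
import math

A, B = 0.0141, 143.0  # same illustrative fit constants as above
GAIN = 10.0           # illustrative photons counted per raw level

def gap(code: int) -> float:
    """Spacing between adjacent retained raw levels at a given compressed code."""
    return A * (code + 1 - B) ** 2 - A * (code - B) ** 2

def photon_noise(code: int) -> float:
    """Photon noise, in raw levels, at the signal this code represents."""
    raw = A * (code - B) ** 2
    return math.sqrt(raw * GAIN) / GAIN

for c in (300, 450, 600, 682):
    print(c, round(gap(c), 1), round(photon_noise(c), 1))
# The gaps grow toward the highlights but stay below the noise at every signal level.
```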
Rather clever IMO.