Doug Kerr
Canon has (have, for you Brits) introduced a new "compression" algorithm, called "Canon-log", as an alternative to the familiar "gamma precompensation" algorithm, in connection with their new line of professional digital cinema cameras. I suspect that before long, something like it will come into use in the realm of digital still photography.
I will give a brief description of it here, but first, I will give some extensive background, after which the description of the new algorithm will be almost trivial.
First, to avert misunderstanding, I need to point out that the term "compression" is used in two wholly different ways in the field of signal processing (including as it pertains to the digital representation of an image).
1. We take a suite of data, perhaps representing an audio waveform or a digital image, comprising a certain number of bits, and recode it into a form having a smaller number of bits, the objective being economy of storage or transmission, in such a way that at a later time or distant point the original suite of data can be reconstructed from the "compressed" data, either precisely or at least such that the reconstructed suite of data will describe (for example) an audio waveform or image that is, to the human listener/viewer, essentially indistinguishable from the original waveform or image.
2. We take a value having a certain range (in terms of its precision) and recode it so that it occupies a smaller range but still holds its original precision on a relative basis. (By the latter I mean, for example, what we have when we say of a certain instrument, "its precision is 2% of the reading".) The objective is that fewer bits per "sample" of the value are required than if the values were encoded directly, while still maintaining a (relative) precision consistent with the precision of the original value.
Our interest here is wholly with meaning 2.
One area in which compression of this sort came into use is in the digital representation of speech, first introduced into telephone transmission. Here, the algorithm used is a logarithmic one, which we can simplistically state as:
y = log(1+kx)/log(1+k)
where x is the input (the actual value of the variable) and y is the output (the value as stored or transmitted). The constant k is a parameter that controls the specific "curve" represented by the equation.
This simplified equation is only valid for positive values of x; in the real world, there are provisions in the equation to make it work for both positive and negative values of x.
If we encode y to a constant precision (perhaps in a fixed number of bits), then the precision of the x that it represents will be constant on a relative basis.
This is nicely consistent with human perception of sound or light, which is essentially logarithmic. That is, the smallest difference in value that the ear or eye can distinguish is essentially proportional to the value.
Thus, the logarithmic recoding algorithm gives us the "best bang for the bit" perceptually.
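To make this concrete, here is a small sketch of the simplified equation above, with the compressed value quantized to 8 bits and then expanded again. (The value k = 255 here is merely illustrative, not the exact parameter of any telephony standard.) The round-trip error stays small on a relative basis even for values a thousand times below full scale, where a linear 8-bit coding would be hopeless:

```python
import math

def compress(x, k=255.0):
    """Logarithmic compression: y = log(1 + k*x) / log(1 + k), for x in [0, 1]."""
    return math.log(1.0 + k * x) / math.log(1.0 + k)

def expand(y, k=255.0):
    """Inverse transform: recover x from the compressed value y."""
    return (math.exp(y * math.log(1.0 + k)) - 1.0) / k

def quantize(y, bits=8):
    """Round y to the nearest of 2^bits - 1 evenly spaced steps in [0, 1]."""
    levels = (1 << bits) - 1
    return round(y * levels) / levels

# Relative error of the compress-quantize-expand round trip, across three decades:
for x in (0.001, 0.01, 0.1, 1.0):
    x_rec = expand(quantize(compress(x)))
    print(f"x = {x:6.3f}  recovered = {x_rec:.6f}  relative error = {abs(x_rec - x) / x:.4%}")
```

Note that with a linear 8-bit coding of the same range, x = 0.001 would fall below a quarter of one quantizing step and could not be usefully represented at all.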
By the way, in the American and international versions of this algorithm, the parameter I show as k is actually designated "mu" and "A", respectively. There are other subtle differences in the details of the two algorithms. They are usually distinguished as the "mu law" and "A-law" algorithms, drawing on the symbol arbitrarily used for the parameter in each.
But compression of a value in this sense was actually introduced earlier, in the original US scheme for the encoding of the video waveform on an analog basis. The main driver for doing so, however, was not as we suggested in connection with the encoding of audio waveforms. Rather, it was the decision to precompensate, in the transmitted signal, for the nonlinear response of the display mechanism, then always a cathode-ray tube (CRT). Doing so eliminated the need for a nonlinear transfer circuit (a slightly complicated thing to make in those days) in the television receiver (where a low cost was an imperative for wide acceptance of the new system).
The response of the CRT was essentially described this way:
L = e^γ

where e was the control grid voltage, L was the resulting luminance on the screen, and γ (the Greek letter gamma) was a constant. This general form is what we call a "power function", since we take a fixed power of the input to determine the output.
As a result, the transform used at the transmitter (essentially the inverse of the equation just above) came to be described as the "gamma precompensation function".
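In rough terms, the arrangement can be sketched like this (I use γ = 2.2 purely for illustration; the actual broadcast standards differ in detail, including a linear segment near black). The transmitter applies the inverse power function, and the CRT's inherent response then cancels it out:

```python
GAMMA = 2.2  # illustrative value only; real standards differ in detail

def precompensate(luminance):
    """Transmitter side: apply the inverse power function, e = L^(1/gamma)."""
    return luminance ** (1.0 / GAMMA)

def crt_response(voltage):
    """Display side: the CRT's inherent nonlinearity, L = e^gamma."""
    return voltage ** GAMMA

# The two functions cancel, so the displayed luminance tracks the scene luminance:
for L in (0.05, 0.25, 0.75):
    print(f"scene {L}  ->  transmitted {precompensate(L):.4f}  ->  displayed {crt_response(precompensate(L)):.4f}")
```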
Now, did this function have the same beneficial property as the "logarithmic" transform used in digital audio, in terms of giving us the biggest perceptual bang for what we spend on precision of the transmitted signal? Not exactly. The perception of the viewer follows (approximately) a logarithmic, not power, function. But we do get the benefit somewhat, at least compared to transmitting the luminance signal in a linear way.
Now, when digital photography emerged, and standards were created for the digital representation of the image, it was concluded that it would be desirable to use a nonlinear transform of the camera's luminance "variable" in the recoded image data. Would this be logarithmic, to best match human perception? No, it would be a power function, to closely follow the TV signal convention and to allow the received signal to be directly applied to a CRT display without requiring any nonlinear transform processing at the "receiving" end.
But the Canon cinema engineers recognized that a logarithmic, rather than power-function, algorithm for "compressing" the signal values would be more beneficial in getting the best perceptual quality given a certain bit "quota", and considering the effects of noise as well. So they crafted a new compression algorithm, "Canon-log", based on a logarithmic compression function. (There is now a new and improved version, Canon-log 2.)
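The practical difference between the two approaches can be seen by comparing how much of the output code range each curve devotes to the darker scene values. The sketch below uses illustrative constants throughout; in particular, k = 500 is NOT Canon's published parameter, just a stand-in of the same general form:

```python
import math

def power_encode(x, gamma=2.2):
    """Power-function ("gamma") encoding, as in conventional video (illustrative gamma)."""
    return x ** (1.0 / gamma)

def log_encode(x, k=500.0):
    """A generic log-style curve of the kind Canon-log uses.
    The constant k is illustrative, NOT Canon's published parameter."""
    return math.log(1.0 + k * x) / math.log(1.0 + k)

# Fraction of the output code range devoted to scene values below 25% of maximum
# (i.e., the bottom two stops):
print("power encoding:", power_encode(0.25))
print("log encoding:  ", log_encode(0.25))
```

The log-style curve assigns a larger share of the available code values to the shadows, which is where relative precision (and noise behavior) matters most.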
The details of the algorithm, and a thorough discussion of the rationale for using it in Canon's professional cinema cameras, are given in the Canon white paper, "Canon-Log Cine Optoelectronic Transfer Function", available here:
https://www.google.com/url?sa=t&rct...FikLJ0rKFdNIH42ZQ&sig2=fKQv_W42frvJAOkjo-e7yA
A brief discussion is available here:
http://learn.usa.canon.com/resources/articles/2011/understand_log_gamma.shtml
Best regards,
Doug