Doug Kerr
Most digital cameras have, in front of the sensor array, a "spatial low-pass filter" often described as an "anti-aliasing" filter. There are a lot of misunderstandings about the role and purpose of this filter.
I will explain the concept in a different context - one in which the situation is clearer than in the digital camera case. Once the basic principle is in hand, the further complications of our situation can be better dealt with.
The context is the digital representation of an audio waveform. It begins with "sampling" of the waveform. That means we capture the instantaneous value of the source waveform repetitively, at a rate of fs (the sampling frequency). These values are then represented in digital form and stored, sent to a distant point, or such.
The Nyquist-Shannon sampling theorem tells us that, if we sample a waveform at a rate fs, and if all the frequency components of the waveform have frequencies less than fs/2, then from the suite of sample values we can reconstruct the original waveform. Not a close approximation of it, but precisely the original waveform.
Note at this point that to actually achieve this, the values of the samples must be preserved "precisely". That is an issue unrelated to the topic of this note, although important in the entire story of digital representation of audio waveforms.
The frequency fs/2 is called the "Nyquist frequency" of the particular sampling scheme.
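To make the reconstruction idea concrete, here is a minimal numeric sketch in Python with NumPy (my illustration, not part of the original discussion). It applies the Whittaker-Shannon interpolation formula to samples of a 1000 Hz cosine taken at fs = 8000 Hz. With infinitely many samples the formula is exact; with a finite window, as here, it is only very close:

```python
import numpy as np

fs = 8000                   # sampling rate, Hz
T = 1 / fs                  # sampling interval, 125 microseconds
n = np.arange(-2000, 2000)  # a finite window of sample indices around t = 0

# Samples of a 1000 Hz cosine: well below the Nyquist frequency fs/2 = 4000 Hz.
samples = np.cos(2 * np.pi * 1000 * n * T)

# Whittaker-Shannon interpolation: x(t) = sum over n of x[n] * sinc((t - n*T)/T).
t = 0.25 * T                # a point *between* two sampling instants
reconstructed = np.sum(samples * np.sinc((t - n * T) / T))

print(reconstructed, np.cos(2 * np.pi * 1000 * t))  # nearly equal
```

The reconstructed value agrees with the true waveform value to within the small error introduced by truncating the (in principle infinite) sum.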
Now suppose we present to the sampling stage of our digital system a waveform that has a component at a frequency of greater than fs/2. (The case where it is exactly the same is harder to visualize, so I just will not allow it at this point.)
What happens to this component when we attempt to reconstruct the source waveform from the suite of sample values? We have been warned by Nyquist and Shannon that it will not be handled properly (that is, we were told not to have any such). Is it just missing from the reconstructed waveform? No, much worse.
In the reconstructed waveform there will be a component, with frequency less than fs/2, that was not in the source waveform at all - a spurious component. Not good.
To make the particulars most clear, let's use a numeric example. Suppose fs is 8000 Hz (that is, we sample the source waveform 8000 times per second, at intervals of 125 µs). Thus, the Nyquist limit is 4000 Hz: we can only expect proper behavior for a source waveform all of whose components have frequencies less than 4000 Hz.
But suppose that we somehow submit to the system for digital representation a waveform that has a component at 4500 Hz.
When we try to reconstruct the source waveform by processing the collection of sample values, we will get a waveform that includes a component with frequency 3500 Hz. That frequency is as far below the Nyquist frequency as the "rogue" component was above it.
How does this happen? Well, we find that if we sample a 4500 Hz test waveform 8000 times per second, the sequence of values is identical to that we get if we sample a 3500 Hz waveform 8000 times per second.
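This is easy to verify numerically. A short NumPy sketch (my illustration; I use cosine components - a sine at 4500 Hz produces the samples of a phase-inverted 3500 Hz sine, still the same alias frequency):

```python
import numpy as np

fs = 8000                          # samples per second
n = np.arange(32)                  # sampling instants t = n / fs

# A "rogue" cosine at 4500 Hz and a legitimate cosine at its alias,
# fs - 4500 = 3500 Hz. Sampled at 8000 Hz, they are indistinguishable.
rogue = np.cos(2 * np.pi * 4500 * n / fs)
legit = np.cos(2 * np.pi * 3500 * n / fs)

print(np.allclose(rogue, legit))   # True: identical sample sequences
```

No processing of the sample values can tell the two apart; the information distinguishing them was discarded at the moment of sampling.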
What does the "decoder" do with that? Well, if it "understood the theory", it would "know" that this sequence of sample values could come from a source component at 3500 Hz, or 4500 Hz, or in fact at an infinity of higher frequencies. So what should it do?
Well, the decoder's designers "told" the decoder it was operating in a context where no component of the waveform should exist at or above 4000 Hz. So its action is to decode that sequence of sample values into the only legitimate component they can come from: 3500 Hz.
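That folding rule generalizes: whatever the input frequency, the decoded component lands in the band from 0 to fs/2. A small hypothetical helper (the function name and its form are mine, purely for illustration):

```python
def alias_frequency(f, fs):
    """Frequency (Hz) at which a component at f appears after sampling at fs."""
    f = f % fs             # sampling cannot distinguish f from f + k*fs
    return min(f, fs - f)  # ...nor f from fs - f (mirror about fs/2)

print(alias_frequency(4500, 8000))    # the rogue decodes as 3500
print(alias_frequency(3500, 8000))    # a legitimate component is unchanged
print(alias_frequency(12500, 8000))   # a still-higher rogue also lands at 3500
```

This is why the decoder's choice is forced: 3500 Hz is the only candidate below the Nyquist frequency.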
One way to look at this is that our rogue component (at 4500 Hz) travels with the appearance of a legitimate 3500 Hz component - that is, travels under a false identity, under an alias.
Thus, one name applied to the fact that the reconstructed waveform contains a spurious component is aliasing.
The audible result is that the reconstructed waveform is not the same as the one we wanted delivered - it is corrupted. How do we prevent this?
"Doctor, doctor, when I do this my elbow hurts!"
"Well, then don't do that."
The solution is quite direct. We place in the path of the incoming waveform a low-pass filter that blocks all frequencies at or above 4000 Hz. Then, no "rogue" components will be present at sampling. They are indeed left behind at the vestibule.
But of course we can't construct a filter that cuts off suddenly a tiny distance below 4000 Hz. If we could, it would have undesirable side effects.
What we do is use a sampling rate of 8000 Hz in a system that will only be used to store or transport audio waveforms with components up to perhaps about 3500 Hz. Then we have from just above 3500 Hz to just below 4000 Hz for our low-pass filter to "roll off" in its response.
That low-pass filter is sometimes called an "antialiasing" filter.
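In a real converter this filter is analog, acting on the waveform before it ever reaches the sampler. But the idea can be sketched digitally by letting a much higher-rate discrete signal stand in for the "analog" one. Here is a windowed-sinc FIR low-pass with its cutoff centered in the 3500-4000 Hz roll-off band (every number here is my illustrative choice, not from the discussion above):

```python
import numpy as np

fs_sim = 96_000    # high rate standing in for the continuous "analog" domain
cutoff = 3_750     # Hz: midway through the 3500-4000 Hz transition band
num_taps = 801     # odd tap count -> symmetric, linear-phase FIR

# Windowed-sinc low-pass filter design (Hamming window).
k = np.arange(num_taps) - (num_taps - 1) / 2
h = np.sinc(2 * cutoff * k / fs_sim) * np.hamming(num_taps)
h /= h.sum()       # normalize for unity gain at DC

t = np.arange(fs_sim) / fs_sim           # one second of signal
legit = np.sin(2 * np.pi * 3000 * t)     # passband component
rogue = np.sin(2 * np.pi * 4500 * t)     # would alias if it reached the sampler

def rms(x):
    return np.sqrt(np.mean(x ** 2))

mid = slice(num_taps, -num_taps)         # ignore edge transients
legit_gain = rms(np.convolve(legit, h, mode="same")[mid]) / rms(legit[mid])
rogue_gain = rms(np.convolve(rogue, h, mode="same")[mid]) / rms(rogue[mid])
print(legit_gain, rogue_gain)            # near 1.0 at 3000 Hz, tiny at 4500 Hz
```

The 3000 Hz component passes essentially untouched, while the 4500 Hz rogue is attenuated to a small fraction of its amplitude before it can be sampled and masquerade as 3500 Hz.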
What happens if we leave it out? We have the phenomenon of aliasing, and the reconstructed waveform is different from the original one we aspired to transport, often in a very troublesome way. It may sound really lousy, owing to components that aren't even related harmonically to the fundamental frequency of the original audio signal.
Well, can we take care of that later in the system? No, not really. The actual precise composition of the original waveform has been lost forever. There are of course extremely complex ways (practical with today's signal processing power) to partially suppress the audible symptoms, but they do not result in our getting back a "precise" reconstruction of the original audio waveform - just one that "sounds about the same". If the audio message is just Aunt Martha complaining about Cousin Mae, that will be fine. If it's a symphony concert on its way from the studio to the transmitter (we'll imagine a system designed with a higher sampling rate), not so fine. If it is an encoded telemetry message, not worth a damn.
Now, as I mentioned, in the digital image situation, the very same principle pertains, but with some complications. I'll go there in part 2 of this series.
Best regards,
Doug