Critical frequency

A family of sinusoids at the critical frequency, all having the same sample sequences of alternating +1 and –1. That is, they all are aliases of each other, even though their frequency is not above half the sample rate.

The Nyquist rate is defined as twice the bandwidth of the continuous-time signal. The sampling frequency must be strictly greater than the Nyquist rate of the signal to achieve unambiguous representation of the signal. This constraint is equivalent to requiring that the system's Nyquist frequency (also known as critical frequency, and equal to half the sample rate) be strictly greater than the bandwidth of the signal. If the signal contains a frequency component at precisely the Nyquist frequency, then the sample values do not carry enough information to reconstruct that component of the continuous-time signal, because its amplitude and phase cannot be separated. In that case, infinitely many different sinusoids at the Nyquist frequency (of varying amplitude and phase) are represented by the same discrete samples.

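As an example, consider this family of signals at the critical frequency:

$$x(t) = \frac{\cos(2\pi B t + \theta)}{\cos(\theta)} = \cos(2\pi B t) - \sin(2\pi B t)\tan(\theta), \qquad -\pi/2 < \theta < \pi/2,$$

where the samples, taken at fs = 2B (that is, T = 1/(2B)),

$$x[n] = x(nT) = \cos(\pi n) - \underbrace{\sin(\pi n)}_{0}\,\tan(\theta) = (-1)^n$$

are in every case just alternating –1 and +1, for any phase θ. There is no way to determine either the amplitude or the phase of the continuous-time sinusoid x(t) that x[n] was sampled from. This ambiguity is the reason for the strict inequality of the sampling theorem's condition.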
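The ambiguity can be checked numerically. The following is a minimal sketch, assuming the family is parameterized as x(t) = cos(2πBt + θ)/cos(θ) with |θ| < π/2 and sampled at fs = 2B; the function names and the chosen phase values are illustrative only.

```python
import math

def critical_family_sample(n, theta, B=1.0):
    """Sample x(t) = cos(2*pi*B*t + theta) / cos(theta) at t = n*T, T = 1/(2B)."""
    T = 1.0 / (2.0 * B)          # sampling at exactly fs = 2B (assumed)
    t = n * T
    return math.cos(2.0 * math.pi * B * t + theta) / math.cos(theta)

def sample_sequences(thetas, count=8, B=1.0):
    """Return one sample sequence per phase, rounded to suppress float noise."""
    return [[round(critical_family_sample(n, th, B), 9) for n in range(count)]
            for th in thetas]

# Four different phases (hence four different amplitudes 1/cos(theta)).
sequences = sample_sequences([0.0, 0.5, -1.0, 1.2])
for seq in sequences:
    print(seq)
```

Every printed sequence is the same alternating pattern, even though the underlying continuous-time sinusoids differ in both amplitude and phase.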

Mathematical basis for the theorem

A Dirac comb, modulated by the sample values of a signal

The Nyquist–Shannon sampling theorem states that, given a bandlimited continuous-time signal x(t) that is uniformly sampled at a sufficient rate, the samples alone retain enough information to reconstruct the original continuous-time signal perfectly, even though all of the information in the signal between samples is discarded. To prove this, a different function is first constructed, conceptually, from the whole original signal, but preserving information from just the sample instants:

$$x_s(t) = x(t) \cdot T \cdot \text{Ш}_T(t)$$
x(t) is the original continuous-time signal.
xs(t) is a function that depends only on the values of x(t) at discrete moments of time.
ШT(t) is the sampling operator called the Dirac comb and, being periodic with period T, can be formally expressed as a Fourier series:

$$\text{Ш}_T(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT) = \frac{1}{T} \sum_{k=-\infty}^{\infty} e^{i 2\pi k f_s t}$$

fs = 1/T is the sampling frequency and is the fundamental frequency of the periodic function ШT(t).
δ(t-nT) is a Dirac impulse delayed to time nT.
The (implied) limit in the Fourier summation is not in the pointwise sense but in the sense of tempered distributions; see also Dirichlet kernel.

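Since the Dirac impulse is zero except where its argument is zero, ШT(t) takes a value of zero except for values of t that are at the sampling instants, nT, for integer n. Therefore xs(t) also takes on zero values for all t except for the sampling instants nT. Multiplying x(t) by ШT(t) effectively discards all of the information between sampling instants and retains information only at the sampling instants nT. xs(t) can be represented in terms of the samples:

$$\begin{aligned}
x_s(t) &= x(t) \cdot T \sum_{n=-\infty}^{\infty} \delta(t - nT) \\
&= T \sum_{n=-\infty}^{\infty} x(t)\,\delta(t - nT) \\
&= T \sum_{n=-\infty}^{\infty} x(nT)\,\delta(t - nT) \\
&= T \sum_{n=-\infty}^{\infty} x[n]\,\delta(t - nT)
\end{aligned}$$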

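where x[n] = x(nT) are the samples. The sequence of sample impulses xs(t) can also be written in terms of the Fourier series of the Dirac comb:

$$x_s(t) = x(t) \cdot T \cdot \frac{1}{T} \sum_{k=-\infty}^{\infty} e^{i 2\pi k f_s t} = \sum_{k=-\infty}^{\infty} x(t)\, e^{i 2\pi k f_s t}$$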

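Using the frequency shifting property of the continuous Fourier transform,

$$\begin{aligned}
X_s(f) &= \mathcal{F}\{x_s(t)\} \\
&= \sum_{k=-\infty}^{\infty} \mathcal{F}\left\{x(t)\, e^{i 2\pi k f_s t}\right\} \\
&= \sum_{k=-\infty}^{\infty} X(f - k f_s)
\end{aligned}$$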

where X(f) is the Fourier transform of x(t). This says that the spectrum of the baseband signal being sampled is shifted and repeated forever at integral multiples of the sampling frequency, fs. These repeated copies are called images of the original signal spectrum.
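A time-domain consequence of this periodic spectrum is that a sinusoid at frequency f0 and sinusoids at f0 + k·fs produce identical samples. The sketch below illustrates this with standard-library Python; the values of fs and f0 and the function name are assumptions chosen for the example.

```python
import math

def sample_cosine(freq, fs, count):
    """Sample cos(2*pi*freq*t) at t = n/fs for n = 0..count-1."""
    return [math.cos(2.0 * math.pi * freq * n / fs) for n in range(count)]

fs = 8.0   # hypothetical sampling frequency (Hz)
f0 = 1.0   # hypothetical baseband frequency (Hz)

base = sample_cosine(f0, fs, 16)
# Sinusoids located at the image frequencies f0 + k*fs.
images = {k: sample_cosine(f0 + k * fs, fs, 16) for k in (-2, -1, 1, 2)}

# Largest sample-by-sample difference between the baseband tone and any image.
max_diff = max(abs(a - b) for seq in images.values() for a, b in zip(base, seq))
print(max_diff)
```

The printed difference is at the level of floating-point rounding, so the sample sequences cannot distinguish the baseband tone from a tone at any of its image frequencies.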

Now constrain x(t) to be bandlimited to B (that is, X(f) = 0 for all |f| > B), and consider what condition precludes overlapping of the adjacent images X(f − kfs):

right edge of kth image of X(f) < left edge of (k+1)th image

$$\begin{aligned}
k f_s + B &< (k+1) f_s - B \\
2B &< f_s \\
B &< \frac{f_s}{2}
\end{aligned}$$

With that condition satisfied, there is no overlap of images in Xs(f), so X(f) (and thus x(t)) can be reconstructed from Xs(f) (or xs(t)) by low-pass filtering out all of the images of X(f) in Xs(f) except for the original image at baseband. To do that, fs > 2B must hold (to prevent overlap) and the frequency response of the reconstruction filter H(f) must be:

$$H(f) = \begin{cases} 1 & |f| \le B \\ 0 & |f| > f_s - B \end{cases}$$

The transition band of the reconstruction low-pass filter lies between B and fs − B, and the filter response need not be precisely defined in that region (since Xs(f) has no non-zero spectrum there). However, the worst case is when the bandwidth B is virtually as large as the Nyquist frequency fs/2, and in that worst case the reconstruction filter H(f) must be:

$$H(f) = \mathrm{rect}\left(\frac{f}{f_s}\right) = \begin{cases} 1 & |f| < f_s/2 \\ 0 & |f| > f_s/2 \end{cases}$$

where rect(u) is the rectangular function.

With H(f) so defined, it is clear that

$$H(f)\, X_s(f) = \mathrm{rect}\left(\frac{f}{f_s}\right) \sum_{k=-\infty}^{\infty} X(f - k f_s) = X(f)$$

and the spectrum of the original signal that was sampled, X(f), is recovered from the spectrum of the sampled signal, Xs(f). This means, in the time domain, that the original signal that was sampled, x(t), is recovered from the sampled signal, xs(t).

Spectrum, Xs(f), of a properly sampled bandlimited signal (blue) and images (green) that do not overlap. A "brick-wall" low-pass filter, H(f), removes the images, leaves the original spectrum, X(f), and recovers the original signal from the samples.

This completes the proof of the Nyquist–Shannon sampling theorem. It says that if the sampling frequency, fs, is strictly greater than twice the bandwidth, B, of the continuous-time baseband signal, x(t), then no information is lost (or "aliased") by sampling.
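The condition fs > 2B can be restated as a comparison of image edges: the right edge of the kth image, k·fs + B, must lie strictly below the left edge of the (k+1)th image, (k+1)·fs − B. A small sketch of that check follows; the function name and the example values are hypothetical.

```python
def images_overlap(B, fs, k=0):
    """True if the k-th image's right edge reaches or passes the (k+1)-th image's left edge."""
    right_edge_k = k * fs + B
    left_edge_next = (k + 1) * fs - B
    return right_edge_k >= left_edge_next

print(images_overlap(B=3.0, fs=8.0))  # False: 3 < 5, images are separated
print(images_overlap(B=4.0, fs=8.0))  # True: edges touch at the critical frequency
print(images_overlap(B=5.0, fs=8.0))  # True: images overlap, aliasing occurs
```

The result does not depend on k, which reflects the fact that the single condition fs > 2B separates every pair of adjacent images.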

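To reconstruct x(t) from the samples x[n], a reconstruction filter (a brick-wall low-pass filter) with response H(f) is constructed. The impulse response of the reconstruction filter is the inverse Fourier transform of H(f):

$$\begin{aligned}
h(t) &= \mathcal{F}^{-1}\{H(f)\} \\
&= \int_{-\infty}^{\infty} \mathrm{rect}\left(\frac{f}{f_s}\right) e^{i 2\pi f t}\, df \\
&= \int_{-f_s/2}^{f_s/2} e^{i 2\pi f t}\, df \\
&= \left. \frac{1}{i 2\pi t}\, e^{i 2\pi f t} \right|_{-f_s/2}^{f_s/2} \\
&= \frac{1}{\pi t} \cdot \frac{e^{i \pi f_s t} - e^{-i \pi f_s t}}{2i} \\
&= \frac{\sin(\pi f_s t)}{\pi t} \\
&= f_s\, \mathrm{sinc}(f_s t)
\end{aligned}$$

in terms of the normalized sinc function.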
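The closed form h(t) = fs·sinc(fs·t) can be cross-checked by integrating the brick-wall response numerically. The sketch below uses a simple midpoint rule with standard-library Python; the step count, the value of fs and the function names are assumptions made for illustration.

```python
import cmath
import math

def sinc(u):
    """Normalized sinc: sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def h_closed_form(t, fs):
    """Impulse response of the brick-wall filter: fs * sinc(fs * t)."""
    return fs * sinc(fs * t)

def h_numeric(t, fs, steps=20000):
    """Midpoint-rule approximation of the integral of exp(i*2*pi*f*t) over |f| < fs/2."""
    df = fs / steps
    total = 0.0 + 0.0j
    for m in range(steps):
        f = -fs / 2.0 + (m + 0.5) * df   # midpoint of the m-th frequency slice
        total += cmath.exp(2j * math.pi * f * t) * df
    return total

fs = 4.0  # hypothetical sampling frequency
for t in (0.0, 0.1, 0.25, 0.6):
    print(t, h_closed_form(t, fs), h_numeric(t, fs).real)
```

The two columns agree to within the accuracy of the numerical integration, and the imaginary part of the numerical result is negligible because the integration interval is symmetric about f = 0.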

The input to this filter is the sampled signal xs(t), which is just a collection of Dirac impulses, δ(t − nT), each delayed to its sampling instant, nT, and weighted by a value proportional to the value of the continuous-time signal at that instant, x[n] = x(nT). Since the reconstruction filter is a linear, time-invariant system, each impulse at time nT generates its own impulse response delayed to the same time, and the output of the reconstruction filter is the sum of the outputs driven by each weighted impulse separately. For each input impulse, the component of the output is the impulse response delayed to the time of that input impulse, h(t − nT), and weighted by the same coefficient attached to that input impulse, Tx[n]. That is, the output of the reconstruction filter is:

$$\begin{aligned}
x(t) &= h(t) * x_s(t) \\
&= \sum_{n=-\infty}^{\infty} T\, x[n]\, h(t - nT) \\
&= \sum_{n=-\infty}^{\infty} T\, x[n]\, f_s\, \mathrm{sinc}\left(f_s (t - nT)\right) \\
&= \sum_{n=-\infty}^{\infty} x[n]\, \mathrm{sinc}\left(f_s (t - nT)\right) \\
&= \sum_{n=-\infty}^{\infty} x[n]\, \mathrm{sinc}\left(\frac{t - nT}{T}\right)
\end{aligned}$$

where * is the convolution operator.

This shows explicitly how the samples x[n] are combined to reconstruct the original function x(t).
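As an illustration, the interpolation sum x(t) = Σ x[n]·sinc((t − nT)/T) can be evaluated with a finite number of terms. The sketch below is an approximation under stated assumptions: the infinite sum is truncated to a finite window of samples, and the test signal, sampling frequency and function names are hypothetical choices for the example.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u)/(pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, T, t, n_start=0):
    """Truncated interpolation sum: sum over n of x[n] * sinc((t - n*T)/T)."""
    return sum(x_n * sinc((t - (n_start + i) * T) / T)
               for i, x_n in enumerate(samples))

def x(t):
    """Hypothetical test signal, bandlimited to B = 1.5 Hz."""
    return math.sin(2 * math.pi * 0.5 * t) + 0.5 * math.cos(2 * math.pi * 1.5 * t)

fs = 8.0                      # sampling frequency, comfortably above 2B = 3 Hz
T = 1.0 / fs
n_start, n_stop = -2000, 2000  # finite window standing in for the infinite sum
samples = [x(n * T) for n in range(n_start, n_stop + 1)]

for t in (0.03, 0.4, 1.234):   # instants that fall between samples
    print(t, x(t), reconstruct(samples, T, t, n_start))
```

The reconstructed values match the original signal up to a small truncation error, which shrinks as the window of samples grows; at the sampling instants themselves the sum returns the sample values essentially exactly, since sinc vanishes at every non-zero integer.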