DCT, or Down the Rabbit-Hole (Part I, Correlated)

Until now we have discussed only one aspect of video encoding, namely how to eliminate time redundancy. Now it is time to talk about space, or frequency, redundancy, and to find out “just how deep the rabbit hole is”.

The concept of space redundancy may appear rather vague, but it rests on a simple yet very important idea: strong correlation among adjacent pixels. In other words, if you pick a random pixel in the image, there is a high probability that its color will match or come very close to that of the adjacent pixels. This statistical relationship breeds information redundancy, as each pixel carries information about its neighbors. In order to achieve compression, this redundancy has to be eliminated.

Let us consider the idea of correlation using a simple example. Let’s take 8 consecutive luminance samples with the following values:

This is a real example that provides a nice demonstration of how close the values of adjacent samples are. Let’s subtract from each sample the value of the previous sample.

In this simple way we have captured the correlation between the values and considerably reduced the number of bits required to encode them.
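The subtraction step can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical (they are not the values from the example above), and the helper names `delta_encode`, `delta_decode`, and `signed_bits` are made up for this sketch. The bit count assumes a simple fixed-width encoding: 8 bits for each raw sample, and one common signed width for all the differences.

```python
def delta_encode(samples):
    """Keep the first sample; replace every later sample by its
    difference from the previous one."""
    return [samples[0]] + [cur - prev for prev, cur in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def signed_bits(value):
    """Smallest two's-complement width (in bits) that can hold value."""
    magnitude = value if value >= 0 else -value - 1
    return magnitude.bit_length() + 1


# Hypothetical 8-bit luminance samples, NOT the values from the article.
samples = [150, 152, 151, 153, 152, 151, 153, 152]
deltas = delta_encode(samples)

# Fixed-width cost: 8 bits per raw sample versus 8 bits for the first
# element plus one common signed width for the remaining differences.
raw_bits = 8 * len(samples)
diff_width = max(signed_bits(d) for d in deltas[1:])
delta_bits = 8 + diff_width * (len(deltas) - 1)

print(deltas)                # [150, 2, -1, 2, -1, -1, 2, -1]
print(raw_bits, delta_bits)  # 64 29
assert delta_decode(deltas) == samples  # the transform is reversible
```

For these made-up samples the differences fit into 3 signed bits each, so the whole row shrinks from 64 to 29 bits under the stated assumptions, and a running sum restores the original values exactly.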

Repeating the transformation does not reduce the data values any further. This indicates that the first pass has already removed the correlation between adjacent samples.
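A small sketch can make this observation concrete. It uses hypothetical sample values (not the article's data) and applies the same differencing step a second time, this time to the residuals left by the first pass. The helper names `differences` and `total_magnitude` are invented for this illustration, and the sum of absolute values is just one convenient, assumed measure of "how large" the data is.

```python
def differences(values):
    """Keep the first value; replace each later value by its
    difference from the previous one."""
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]


def total_magnitude(values):
    """Sum of absolute values, used here as a rough size measure."""
    return sum(abs(v) for v in values)


# Hypothetical luminance samples, not the article's data.
samples = [150, 152, 151, 153, 152, 151, 153, 152]

first_pass = differences(samples)[1:]  # residuals after one pass
second_pass = differences(first_pass)  # same step applied to the residuals

print(first_pass, total_magnitude(first_pass))    # [2, -1, 2, -1, -1, 2, -1] 10
print(second_pass, total_magnitude(second_pass))  # [2, -3, 3, -3, 0, 3, -3] 17
```

For these values the second pass makes the residuals larger, not smaller: once neighboring values are no longer correlated, subtracting one from another only adds their fluctuations together.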

Please note that almost all of the energy ended up in the first element. This leads us to the idea of a compact transform, that is, one that concentrates most of the energy in a small number of coefficients.
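Compactness can be checked numerically. The sketch below assumes the usual definition of energy as the sum of squared values, uses hypothetical samples (not the article's data), and relies on a made-up helper, `energy_share_of_first`, that reports what fraction of the total energy sits in the first element.

```python
def energy_share_of_first(values):
    """Fraction of the total energy (sum of squares) held by element 0."""
    energy = [v * v for v in values]
    return energy[0] / sum(energy)


# Hypothetical luminance samples, not the article's data.
samples = [150, 152, 151, 153, 152, 151, 153, 152]

# First element kept as is, every later one replaced by its difference
# from the previous sample.
deltas = [samples[0]] + [cur - prev for prev, cur in zip(samples, samples[1:])]

print(round(energy_share_of_first(samples), 3))  # 0.122
print(round(energy_share_of_first(deltas), 3))   # 0.999
```

Before the transform the energy is spread almost evenly over the eight samples; afterwards nearly all of it sits in one coefficient. Note that simple differencing, unlike an orthonormal transform, does not preserve the total energy, so the two shares are fractions of different totals.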

Next time we will build a two-dimensional transform that is efficient, compact, and reversible.
