Discrete Cosine Transform Sample
Discrete Cosine Transform (DCT) and quantization are the first two steps in the JPEG compression standard. This sample demonstrates how the DCT and quantization stages can be implemented to run faster using Intel® Cilk™ Plus.
DCT represents a block of data points as a finite sum of mutually orthogonal cosine functions of different frequencies. The transform itself is invertible; the loss in JPEG comes from the quantization step that follows, which rounds the DCT coefficients. To make the effect of quantization on image quality visible, the output of the quantization phase is passed to the de-quantizer, followed by the Inverse Discrete Cosine Transform (IDCT), and the result is saved as an output bitmap image.
The sample contains three implementations of the 2D-DCT (two-dimensional DCT) algorithm: a serial version, an Array Notation (AN) version for explicit vectorization, and a cilk_for + Array Notation version that combines threading and vectorization.
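The DCT, quantization, de-quantization and IDCT round trip described above can be sketched for a single 8x8 block as follows. This is a minimal illustration in plain standard C++, not the sample's own source and not Cilk™ Plus code. All names (Block, dct2d, idct2d, quantize, dequantize, max_abs_diff) are hypothetical, and the single uniform quantization step is a simplifying assumption; JPEG uses a table with one step per coefficient.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int N = 8;  // JPEG processes the image in 8x8 blocks
using Block = std::array<std::array<double, N>, N>;

// Orthonormal DCT-II basis value for frequency k at sample position n.
inline double basis(int k, int n) {
    const double pi = std::acos(-1.0);
    const double scale = (k == 0) ? std::sqrt(1.0 / N) : std::sqrt(2.0 / N);
    return scale * std::cos((2 * n + 1) * k * pi / (2.0 * N));
}

// Forward 2D DCT (serial, direct form):
// out[u][v] = sum over x, y of in[x][y] * basis(u, x) * basis(v, y)
Block dct2d(const Block& in) {
    Block out{};  // zero-initialized
    for (int u = 0; u < N; ++u)
        for (int v = 0; v < N; ++v)
            for (int x = 0; x < N; ++x)
                for (int y = 0; y < N; ++y)
                    out[u][v] += in[x][y] * basis(u, x) * basis(v, y);
    return out;
}

// Inverse 2D DCT: the same basis, summed over frequencies instead of positions.
Block idct2d(const Block& in) {
    Block out{};
    for (int x = 0; x < N; ++x)
        for (int y = 0; y < N; ++y)
            for (int u = 0; u < N; ++u)
                for (int v = 0; v < N; ++v)
                    out[x][y] += in[u][v] * basis(u, x) * basis(v, y);
    return out;
}

// Quantization: divide each coefficient by the step and round to an integer.
// ASSUMPTION: one uniform step for all coefficients (hypothetical simplification).
Block quantize(const Block& coeffs, double step) {
    assert(step > 0.0);
    Block out{};
    for (int u = 0; u < N; ++u)
        for (int v = 0; v < N; ++v)
            out[u][v] = std::round(coeffs[u][v] / step);
    return out;
}

// De-quantization: scale the rounded values back; the rounding loss remains.
Block dequantize(const Block& levels, double step) {
    Block out{};
    for (int u = 0; u < N; ++u)
        for (int v = 0; v < N; ++v)
            out[u][v] = levels[u][v] * step;
    return out;
}

// Largest absolute element-wise difference between two blocks.
double max_abs_diff(const Block& a, const Block& b) {
    double worst = 0.0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            const double d = std::fabs(a[i][j] - b[i][j]);
            if (d > worst) worst = d;
        }
    return worst;
}
```

Calling idct2d(dequantize(quantize(dct2d(block), step), step)) mirrors the pipeline for one block. Without the quantize and dequantize calls the block is recovered up to floating-point rounding, which is why the visible quality loss is attributed to quantization. Loop nests of this shape are the kind of code that vectorization and threading are typically applied to.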
Code Change Highlights:
Below are some snapshots of the code changes made to the application code to improve performance.
Performance Data:
Note: Modified Speedup shows the performance speedup relative to the serial implementation.
| Modified Speedup | Compiler (Intel® 64) | Compiler options | System specifications |
|---|---|---|---|
| AN: 2.05x; cilk_for: 4.37x; Both: 8.01x | Intel C++ Compiler 15.0 for Windows | /O2 /Oi /fp:fast /QxAVX | Windows Server 2012*; 2nd Generation Intel Xeon® E3 1280 CPU @ 3.50GHz; 8GB memory |
| AN: 2.35x; cilk_for: 3.63x; Both: 8.53x | Intel C++ Compiler 15.0 for Linux | -O2 -fp-model fast -xAVX | Ubuntu* 10.04; 3rd Generation Intel Core™ i7-2600K CPU @ 3.40GHz; 8GB memory |
Build Instructions: