Performance Tools for Software Developers - Accelerating a data compression application with Intel® IPP


Overview

This application note shows how a small change, linking a data compression application against the zlib library built on Intel® Integrated Performance Primitives (Intel® IPP), can deliver a significant performance gain.

In this application note, a tool "bmp2png / png2bmp" is used. It is a pair of free, simple command-line utilities that convert between Windows* BMP format and PNG (Portable Network Graphics). Please visit b2p-home* for more information.

This guide was created for the following product releases:

  • Intel Integrated Performance Primitives (Intel IPP) 5.3 beta for Linux*
  • Intel IPP sample code for zlib coding: l_ipp-samples_b_5.3.020
  • Linux zlib Compression Library: zlib 1.2.3
  • The official PNG reference library: libpng 1.2.18
  • The converter tool between .bmp and .png image: bmp2png 1.62

The application note covers both IA-32 and Intel® 64 applications. The configuration steps below provide separate instructions for each application.


Downloading application source code

  • The bmp images used in this application note are publicly available from http://premier.intel.com/ under "Intel® IPP Sample" product.

Hardware: This application note has been tested on Intel® Core™2 Duo processors.

Software: This application note was tested with the GCC 3.4.6 compiler on Red Hat* Enterprise Linux AS 4.

Other compilers, such as the Intel® C/C++ Compiler 10.0 for Linux*, can also be used to build the application.


Configuration

To record bmp2png performance, we modify parts of the bmp2png source code. This step is optional. The modification below records zlib encoding performance, including clock ticks per byte and encoding time in milliseconds.

Extract the source code, including bmp2png.c:

# tar -xzvf bmp2png-1.62.tar.gz
# cd bmp2png-1.62

Add timing code to bmp2png.c, for example:

#include "common.h"
#include "bmphed.h"
#include "ippcore.h"          /* ippGetCpuClocks, ippGetCpuFreqMhz */

Ipp64u start, stop;           /* clock counts taken around the encode call */
Ipp64u clocks = 0;
char input_fn[FILENAME_MAX];
char output_fn[FILENAME_MAX];
int input_fn_size;            /* input file size in bytes */
int output_fn_size;           /* output file size in bytes */
int mhz;                      /* CPU frequency in MHz */

int main(int argc, char *argv[])
{
    ……
    { /* measure the input file */
        FILE *file = fopen(input_fn, "rb");
        fseek(file, 0, SEEK_END);
        input_fn_size = ftell(file);
        fclose(file);
    }
    { /* measure the output file */
        FILE *file = fopen(output_fn, "rb");
        fseek(file, 0, SEEK_END);
        output_fn_size = ftell(file);
        fclose(file);
    }
    clocks = stop - start;
    ippGetCpuFreqMhz(&mhz);
    /* prints: file name, bits per byte, clocks per byte, encode time in msec */
    printf("%s\t%f\t%d \t %f msec\n", input_fn,
           (float)output_fn_size / (float)input_fn_size * 8,
           (int)(clocks / input_fn_size),
           (float)(clocks / (mhz * 1000)));

    return (failure > 255) ? 255 : failure;
}

static BOOL read_bmp(char *fn, IMAGE *img)
{
    ……
    strcpy(input_fn, fn);     /* remember the input name for the report */
    imgbuf_init(img);
    ……
}

static BOOL write_png(char *fn, IMAGE *img)
{
    ……
    FILE *fp;
    strcpy(output_fn, fn);    /* remember the output name for the report */
    ……
    start = ippGetCpuClocks();
    png_write_image(png_ptr, img->rowptr);   /* the timed zlib encode */
    stop = ippGetCpuClocks();
}
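
The timing code above opens each file and seeks to the end without checking whether fopen succeeded. As a minimal sketch in standard C (the helper name file_size_bytes is hypothetical and is not part of bmp2png or Intel IPP), the two size-measuring blocks could be replaced by one function that reports failure instead of crashing:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of bmp2png: returns the size of the
 * file at `path` in bytes, or -1 if it cannot be opened or measured. */
long file_size_bytes(const char *path)
{
    FILE *fp;
    long size;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    size = ftell(fp);   /* ftell itself returns -1L on error */
    fclose(fp);
    return size;
}
```

With such a helper, main could assign input_fn_size = file_size_bytes(input_fn) and skip the report when the result is negative.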

Building the application

Step 1: Building Intel® IPP optimized ZLIB library.

  1. Set up your build environment for Intel IPP. Note: if you are using the GCC compiler, you can skip the command that sets up the environment for the Intel C/C++ Compiler.
    32-bit application:
    # source /opt/intel/ipp/5.x/ia32/tools/env/ippvars32.sh
    # source /opt/intel/cc/10.0.x/bin/iccvars.sh
    Intel 64 application:
    # source /opt/intel/ipp/5.x/em64t/tools/env/ippvarsem64t.sh
    # source /opt/intel/cce/10.0/bin/iccvars.sh
  2. Run the build script.
    # tar -xzvf l_ipp-samples_b_5.3.036.tgz
    # cd ipp-samples/data-compression/ipp_zlib/
    # ./build32.sh (or ./buildem64t.sh for a 64-bit application)
    Or
    # make ARCH=linux32 COMP=icc10 CC=icc CXX=icc
  3. Run the IPP zlib test application to verify that the Intel IPP zlib library (libipp_z.a) and the test executable have been built successfully.
    # cd bin/linux32 (or bin/linuxem64t for a 64-bit application)
    # ./ipp_minigzip -9 ../../../../JPEG_image/image1_1k.bmp
    # ./ipp_minigzip -d ../../../../JPEG_image/image1_1k.bmp.gz
    Compare the decompressed file image1_1k.bmp with the original image1_1k.bmp; the two should be identical.
  4. Optional step. Copying to the system library path requires root privileges.
    Copy the Intel IPP zlib library to the system library path:
    # cp ipp_zlib/linux32/libipp_z.a /usr/lib/.
    (or /usr/lib64 for 64-bit applications)
    Or create a folder "backup" and copy the library there:
    # cd ~
    # mkdir backup
    # cd backup
    # cp ipp_zlib/linux32/libipp_z.a /home/xx/backup/.
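
The comparison of the round-tripped BMP file with the original in the verification step above can be done with any byte-wise comparison. The following is a minimal sketch in standard C; the function name files_identical is hypothetical, and it assumes a copy of the original file was kept before compressing, since gzip-style tools usually replace their input file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if the two files have identical
 * contents, 0 if they differ, and -1 if either cannot be opened. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a, *b;
    int ca, cb;

    assert(path_a != NULL && path_b != NULL);

    a = fopen(path_a, "rb");
    if (a == NULL)
        return -1;
    b = fopen(path_b, "rb");
    if (b == NULL) {
        fclose(a);
        return -1;
    }

    /* Read both files in lockstep until a byte differs or both end. */
    do {
        ca = fgetc(a);
        cb = fgetc(b);
    } while (ca == cb && ca != EOF);

    fclose(a);
    fclose(b);
    return ca == cb;   /* equal here means both streams reached EOF */
}
```

A result of 1 indicates that compression followed by decompression reproduced the original file byte for byte.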

Step 2: Build the libpng and libz libraries.

  1. Build the zlib source code.
    # tar -xzvf zlib-1.2.3.tar.tar
    # cd zlib-1.2.3
    # ./configure
    # make
    # cp ../zlib-1.2.3/libz.a /usr/lib/. (optional; use /usr/lib64 for 64-bit applications)
    Or
    # cp ../zlib-1.2.3/libz.a /home/xx/backup/.
  2. Build the libpng 1.2.18 source code.
    # tar -xzvf libpng-1.2.18.tar.gz
    # cd libpng-1.2.18
    # ./configure
    # make
    # cp .libs/libpng.a /home/xx/backup/.

Step 3: Build bmp2png converter.

  1. Modify the code by following the instructions in the Configuration section.

    # tar -xzvf bmp2png-1.62.tar.gz
    # cd bmp2png-1.62
    Then modify bmp2png.c.
  2. Modify the Makefile.

    # vi Makefile
    Edit the Makefile as shown below:
    ifndef CFLAGS
    CFLAGS = -O2 -g -Wall -I/opt/intel/ipp/5.x/ia32/include
    endif
    ifndef LDFLAGS
    LDFLAGS = -L../backup -L/opt/intel/ipp/5.x/ia32/sharedlib
    endif

    LIBS = -lpng -lipp_z -lippdc -lipps -lippcore -lguide -pthread -lm
  3. Build the bmp2png/png2bmp utilities.

    # make
    # su -c "make install"

Running the application

By default, the bmp2png/png2bmp application is installed in /usr/local/bin; run it with the command "/usr/local/bin/bmp2png". The application links against the Intel IPP shared libraries by default, so those libraries must be on the system's library search path. You can set this up by sourcing the appropriate environment script. For example, for IA-32, use the following commands to run the application:

# source /opt/intel/ipp/5.x/ia32/tools/env/ippvars32.sh
# ./bmp2png -9 -Oa.png ../JPEG_image/image1_1k.bmp
The result displays the performance data as shown below:
OK a.png oooooooooooooooooooooooooooooooooooooooooooooooooooooooo
../JPEG_image/image1_1k.bmp 5.397046 489 543.000000 msec
The first number, 5.397046, is the compression ratio in bits per byte (compressed size divided by original size, times 8). Smaller is better.
The second number, 489, is the encoding time in clock ticks per input byte. Smaller is better.
The third number, 543 msec, is the total time spent converting the image. It is derived from the same clock count as the second number, which the modified code measures around the png_write_image call.
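
To make the three reported numbers concrete, the following sketch recomputes them from a raw clock count, the two file sizes, and the CPU frequency in MHz. It uses only standard C; the struct and function names are hypothetical. Unlike the printf in the modified bmp2png.c, which divides integers and so reports whole milliseconds, this sketch keeps the fractional part.

```c
#include <assert.h>

/* Hypothetical container for the three values printed by the
 * modified bmp2png. */
typedef struct {
    double bits_per_byte;               /* compressed bits per input byte */
    unsigned long long clocks_per_byte; /* clock ticks per input byte */
    double msec;                        /* elapsed time in milliseconds */
} encode_metrics;

encode_metrics compute_metrics(unsigned long long clocks,
                               long input_size, long output_size, int mhz)
{
    encode_metrics m;

    assert(input_size > 0 && output_size >= 0 && mhz > 0);

    /* Output bytes per input byte, times 8 bits per byte. */
    m.bits_per_byte = (double)output_size / (double)input_size * 8.0;

    /* Integer division, as in the original report. */
    m.clocks_per_byte = clocks / (unsigned long long)input_size;

    /* A CPU running at `mhz` MHz executes mhz * 1000 ticks per msec. */
    m.msec = (double)clocks / ((double)mhz * 1000.0);

    return m;
}
```

For example, 2,130,000,000 ticks at 2130 MHz correspond to 1000 msec, and a 1,000,000-byte input compressed to 500,000 bytes gives 4 bits per byte.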

Appendix A - Performance comparison

The zlib algorithms and implementation are highly optimized in Intel IPP. The table below shows that, on Core 2 Duo processors, the Intel IPP-based zlib sample is on average about 1.4x faster than the stock Linux* zlib library when encoding in the bmp2png converter (1.25x to 1.72x depending on the image), and on average about 1.16x faster when decoding.

We run the performance test with the following commands:

# bmp2png -9 -Oa.png ../JPEG_image/image1_1k.bmp

# ./png2bmp a.png

Here -9 sets the compression level (the default is -6). Using the same level for both the IPP zlib and the original zlib keeps the compression settings comparable, so the timing results can be compared directly.

Encoding and decoding times are in clock ticks per byte. Speedup compares IPP zlib with zlib 1.2.3, both used through libpng.

| Image          | Size      | Encoding time, zlib 1.2.3 | Encoding time, IPP zlib | Encoding speedup | Decoding time, zlib 1.2.3 | Decoding time, IPP zlib | Decoding speedup |
| image1_160.bmp | 160x120   | 530                       | 423                     | 125%             | 117                       | 87                      | 134%             |
| image1_320.bmp | 320x240   | 657                       | 501                     | 129%             | 93                        | 77                      | 121%             |
| image1_640.bmp | 640x480   | 617                       | 487                     | 126%             | 81                        | 73                      | 111%             |
| image1_800.bmp | 800x600   | 609                       | 475                     | 127%             | 81                        | 72                      | 113%             |
| image1_1k.bmp  | 1024x768  | 620                       | 485                     | 128%             | 88                        | 80                      | 110%             |
| image1_4k.bmp  | 4096x4096 | 1,646                     | 959                     | 172%             | 141                       | 129                     | 109%             |
| Average        |           | 780                       | 557.67                  | 140%             | 100.17                    | 86.33                   | 116%             |

(Tests were run on Intel Core 2 Duo 2.13 GHz processors with 1.0 GB of RAM, Red Hat Enterprise Linux AS Release 4, Intel IPP 5.3 beta, and GCC. The test BMP files are the sample images shipped with the Intel® IPP JPEG samples, which can be downloaded from the Intel® Premier Support website under the "Intel® IPP Sample" product.)


Appendix B - Verifying correctness

The following steps can be used to verify the correctness:

  1. Check that the bmp2png application uses the Intel IPP library, not the default libz.a:
    ldd bmp2png
    linux-gate.so.1 => (0xffffe000)
    libm.so.6 => /lib/tls/libm.so.6 (0x004bc000)
    libippdc.so.5.3 => /opt/intel/ipp/5.3_beta/ia32/sharedlib/libippdc.so.5.3 (0xf7fc7000)
    ……
  2. Open the BMP and PNG image files in an image viewer, such as the GNOME image viewer Eye of GNOME 2.8.1. The images should be displayed correctly.
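
In addition to the visual check, a quick sanity test is to confirm that the converted files start with the expected format signature. The sketch below is an illustration only, with hypothetical function names: it tests a buffer for the fixed 8-byte PNG signature and for the two-byte "BM" marker that begins a Windows BMP file. It does not validate the rest of the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The fixed 8-byte signature that begins every PNG file. */
static const unsigned char PNG_SIGNATURE[8] = {137, 80, 78, 71, 13, 10, 26, 10};

/* Returns 1 if the buffer starts with the PNG signature, else 0. */
int has_png_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= sizeof PNG_SIGNATURE &&
           memcmp(buf, PNG_SIGNATURE, sizeof PNG_SIGNATURE) == 0;
}

/* Returns 1 if the buffer starts with the BMP marker "BM", else 0. */
int has_bmp_signature(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 2 && buf[0] == 'B' && buf[1] == 'M';
}
```

Reading the first 8 bytes of a.png into a buffer and passing them to has_png_signature should return 1 for a file produced by bmp2png.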

Appendix C - Known issues and limitations

There are no known issues with this release.



