Building a faster ZLIB with Intel® Integrated Performance Primitives

By: Zhen Zhao

Published: 03/16/2017   Last updated: 03/16/2017

Intel® Integrated Performance Primitives (Intel® IPP) is available through the free Intel® performance libraries program. Since the 2017 Update 2 release, the libraries can be installed from YUM/APT repositories. Intel IPP also provides optimizations for some open source libraries, for instance the Zlib library for data compression. This article describes how to install Intel IPP from the YUM/APT repositories and how to build zlib with the Intel IPP library. A simple example is also provided for measuring the performance of the zlib compression and decompression functions.

Intel® IPP Installation via YUM/APT

Since the 2017 gold release, the Intel Performance Libraries repositories have been available for APT- and YUM-based distributions. The Performance Libraries package includes the Intel MKL, IPP, TBB and DAAL components. Customers can also download and use the Intel IPP package on its own. Refer to the following links to learn more about accessing Intel products from the YUM/APT package managers:

Installing Intel® Performance Libraries and Intel® Distribution for Python* Using YUM Repository
Installing Intel® Performance Libraries and Intel® Distribution for Python* Using APT Repository
Installing Intel® Parallel Studio XE Runtime 2016 Using YUM Repository

After installing or upgrading the Intel Performance Libraries components, the tools are located in:

/opt/intel/parallel_studio_xe_<VERSION>.<UPDATE>.<BUILD_NUMBER>

You can load the Intel IPP environment variables into the current shell with the source command:

source /opt/intel/parallel_studio_xe_<VERSION>.<UPDATE>.<BUILD_NUMBER>/psxevars.sh <ia32|intel64>

Or

source /opt/intel/parallel_studio_xe_<VERSION>.<UPDATE>.<BUILD_NUMBER>/compilers_and_libraries_2017/linux/ipp/bin/ippvars.sh <ia32|intel64>
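After sourcing either script, it can help to confirm that the environment really is in place before moving on. The following is a minimal sketch of such a check. It assumes only that the script exports IPPROOT, as the later build commands in this article rely on; check_ipp_env is a hypothetical helper name and is not part of Intel IPP.

```shell
#!/usr/bin/env bash
# check_ipp_env: hypothetical helper, not shipped with Intel IPP.
# Reports whether IPPROOT is set and contains an include directory.
check_ipp_env() {
    if [ -z "${IPPROOT:-}" ]; then
        echo "IPPROOT is not set; source ippvars.sh or psxevars.sh first"
        return 1
    fi
    if [ ! -d "$IPPROOT/include" ]; then
        echo "IPPROOT=$IPPROOT has no include directory"
        return 1
    fi
    echo "IPPROOT=$IPPROOT looks usable"
}

# Example call; '|| true' keeps a script running when the check fails.
check_ipp_env || true
```

The check is deliberately shallow: it does not verify library versions, only that the paths used later in the build exist.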

ZLIB optimization with Intel® IPP

Intel provides patch files for the zlib* source that enable drop-in optimization with Intel® IPP functions. The patches currently support zlib versions 1.2.5.3, 1.2.6.1, 1.2.7.3 and 1.2.8. Customers can use the zlib functions with Intel IPP optimization to improve the performance of data compression without rewriting their zlib-based code against the Intel IPP API. The data compression domain of the Intel® IPP library contains several functions that speed up the well-known Zlib library (http://zlib.net) in both data compression and decompression operations.

Downloading ZLIB source code

Before building the library, you need to download Zlib source code files from the Zlib site (http://zlib.net/fossils/). The following table provides links to specific Zlib versions that can be updated with Intel® IPP library function calls. 

 

Zlib Version Web address
1.2.5.3 http://zlib.net/fossils/zlib-1.2.5.3.tar.gz
1.2.6.1 http://zlib.net/fossils/zlib-1.2.6.1.tar.gz
1.2.7.3 http://zlib.net/fossils/zlib-1.2.7.3.tar.gz
1.2.8 http://zlib.net/fossils/zlib-1.2.8.tar.gz

Unzip the downloaded archive with the following command on Linux*/macOS* systems:

tar -xvzf <archive file name>

This command creates the zlib-<version> folder in your working directory, where <version> is the selected Zlib version number, from 1.2.5.3 to 1.2.8. For example, if zlib-1.2.8.tar.gz is saved in $HOME, the unpacked folder is $HOME/zlib-1.2.8.
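The version-to-URL and version-to-folder mapping described above can be sketched as a few small shell functions. The names zlib_url, zlib_dir and is_supported_zlib are hypothetical helpers introduced here for illustration; the URL pattern and the version list come from the table in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the URL pattern follows the table above.

# zlib_url VERSION: print the download address for that version.
zlib_url() {
    echo "http://zlib.net/fossils/zlib-$1.tar.gz"
}

# zlib_dir VERSION PARENT: print the folder that tar creates under PARENT.
zlib_dir() {
    echo "$2/zlib-$1"
}

# is_supported_zlib VERSION: succeed only for versions that have a patch.
is_supported_zlib() {
    case "$1" in
        1.2.5.3|1.2.6.1|1.2.7.3|1.2.8) return 0 ;;
        *) return 1 ;;
    esac
}

for v in 1.2.5.3 1.2.6.1 1.2.7.3 1.2.8; do
    echo "$v  $(zlib_url "$v")"
done
```

A download tool of your choice (wget and curl are common, but which one is installed is an assumption about your system) can then fetch "$(zlib_url 1.2.8)" before the tar command shown earlier is run.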

Patching Source Code

First, set up the Intel IPP environment (including IPPROOT) with the following command:

source /opt/intel/compilers_and_libraries_2017.2.174/linux/ipp/bin/ippvars.sh <ia32|intel64>

Change to the zlib-<version> folder. Apply the corresponding source code patch file, zlib-<version>.patch, from the Intel® IPP product distribution with the following command:

patch -p1 < "path to corresponding patch file"

The patch files are normally located in $IPPROOT/examples/components_and_examples_lin_ps/components/interfaces/ipp_zlib/ when Intel IPP was obtained with Intel Parallel Studio and installed to the default path.

If the patching process completes successfully, you will see the following messages:

patching file adler32.c
patching file crc32.c
patching file deflate.c
patching file deflate.h
patching file inflate.c
patching file inftrees.h
patching file trees.c
patching file zlib.h
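If the terminal output has scrolled away, a rough after-the-fact check is to look for the WITH_IPP macro in the files listed above. This is only a sketch: it assumes the patch guards its changes with the WITH_IPP macro (the same name later passed as -DWITH_IPP), and not every patched file necessarily mentions it. count_ipp_files is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Files the patch reports touching, taken from the messages above.
PATCHED_FILES="adler32.c crc32.c deflate.c deflate.h inflate.c inftrees.h trees.c zlib.h"

# count_ipp_files DIR: print how many of those files in DIR mention WITH_IPP.
# Assumption: the patch guards its changes with the WITH_IPP macro.
count_ipp_files() {
    n=0
    for f in $PATCHED_FILES; do
        if [ -f "$1/$f" ] && grep -q "WITH_IPP" "$1/$f"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example: count in the current directory (prints 0 outside a patched tree).
count_ipp_files .
```

A result of 0 inside the zlib-<version> folder suggests the patch was not applied. With GNU patch, running the patch command with --dry-run beforehand reports what would change without modifying any file.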

Configuring and Building Zlib Library

After the Zlib source code files are patched successfully, you need to generate a makefile for the build process. From the zlib-<version> folder, before building the zlib libraries, you need to add the WITH_IPP definition to the compiler command line, add the $IPPROOT/include directory to the compiler's header search path, and add the corresponding Intel® IPP data compression libraries to the list of ld input files. To do this, create a bash file called ipp_config in the current working directory containing commands like the following for ia32:

# Build dynamic Zlib library with static Intel(R) IPP linkage on Linux*
export CFLAGS="-m32 -DWITH_IPP -I$IPPROOT/include"
export LDFLAGS="$IPPROOT/lib/ia32/libippdc.a $IPPROOT/lib/ia32/libipps.a $IPPROOT/lib/ia32/libippcore.a"
./configure
make shared

*Please note that on macOS* there is no separation between 32- and 64-bit libraries. On Linux* systems, however, you need to specify a different set of libraries for each architecture. If you are using the Intel64 architecture, change -m32 to -m64 and change the Intel IPP library path from ia32 to intel64.

The commands above build zlib with the Intel IPP static libraries. You can also link against the Intel IPP dynamic libraries with the following commands for Intel64:

# Build dynamic Zlib library with dynamic Intel(R) IPP linkage on Linux*
source /opt/intel/compilers_and_libraries_2017.2.174/linux/bin/compilervars.sh intel64
export CFLAGS="-m64 -DWITH_IPP -I$IPPROOT/include"
export LDFLAGS="-L$IPPROOT/lib/intel64 -lippdc -lipps -lippcore"
./configure
make shared

*Please note that the static (libz.a) and dynamic (libz.so) zlib libraries are generated in the current working directory, not in /usr/local/zlib/lib.
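The static and dynamic configurations differ only in the architecture flag and in how the three Intel IPP libraries are named on the link line. The sketch below generates both variants from two parameters. ipp_build_flags is a hypothetical helper, not part of zlib or Intel IPP; the flags it prints are the ones used in the commands above.

```shell
#!/usr/bin/env bash
# ipp_build_flags ARCH LINKAGE [ROOT]: print export lines for CFLAGS/LDFLAGS.
# ARCH is ia32 or intel64; LINKAGE is static or dynamic.
# ROOT defaults to the literal text $IPPROOT so the output can be pasted
# into a shell where the Intel IPP environment is already loaded.
ipp_build_flags() {
    arch=$1
    linkage=$2
    root=${3:-}
    [ -n "$root" ] || root='$IPPROOT'
    case "$arch" in
        ia32)    bits=-m32 ;;
        intel64) bits=-m64 ;;
        *) echo "unknown architecture: $arch" >&2; return 1 ;;
    esac
    lib="$root/lib/$arch"
    case "$linkage" in
        static)  ld="$lib/libippdc.a $lib/libipps.a $lib/libippcore.a" ;;
        dynamic) ld="-L$lib -lippdc -lipps -lippcore" ;;
        *) echo "unknown linkage: $linkage" >&2; return 1 ;;
    esac
    printf 'export CFLAGS="%s -DWITH_IPP -I%s/include"\n' "$bits" "$root"
    printf 'export LDFLAGS="%s"\n' "$ld"
}

ipp_build_flags intel64 dynamic
# prints:
# export CFLAGS="-m64 -DWITH_IPP -I$IPPROOT/include"
# export LDFLAGS="-L$IPPROOT/lib/intel64 -lippdc -lipps -lippcore"
```

Printing the export lines rather than setting the variables directly keeps the helper free of side effects, so the result can be reviewed before ./configure is run.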

To check whether the zlib dynamic library was linked with Intel IPP successfully, inspect the shared library dependencies of libz.so. The result should look like this:

ldd libz.so
        linux-vdso.so.1 =>  (0x00007fffea9fd000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f8890f14000)
        libippdc.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/ipp/lib/intel64/libippdc.so (0x00007f8890d0d000)
        libipps.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/ipp/lib/intel64/libipps.so (0x00007f8890ac4000)
        libippcore.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/ipp/lib/intel64/libippcore.so (0x00007f88908b7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f8891519000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f88906b3000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f88903b1000)

*Please note that if you build libz.so with the Intel IPP static libraries, ldd will show dependencies on system dynamic libraries only. ldd does not display statically linked libraries, so this does not mean that linking with Intel IPP failed. In this case, the resulting libz library will be larger than 700 KB if Intel IPP was linked successfully, because it contains optimizations for all CPUs supported by Intel® IPP.
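Both checks described above, the ldd listing for dynamic linkage and the file size for static linkage, can be scripted. The following is a minimal sketch; ipp_libs_in_ldd and is_larger_than_kb are hypothetical helper names, and the 700 KB threshold is the figure quoted in this article rather than a guaranteed bound.

```shell
#!/usr/bin/env bash
# ipp_libs_in_ldd: read ldd output on stdin and print the names of the
# Intel IPP libraries it lists, one per line.
ipp_libs_in_ldd() {
    awk '$1 ~ /^libipp/ { print $1 }'
}

# is_larger_than_kb FILE KB: succeed if FILE is larger than KB kilobytes.
is_larger_than_kb() {
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt $(( $2 * 1024 )) ]
}

# Example with two lines in the format shown in the ldd listing above.
ipp_libs_in_ldd <<'EOF'
        libc.so.6 => /lib64/libc.so.6 (0x00007f8890f14000)
        libippdc.so => /opt/intel/compilers_and_libraries_2017.2.174/linux/ipp/lib/intel64/libippdc.so (0x00007f8890d0d000)
EOF
# prints: libippdc.so
```

In practice the first helper would be fed with `ldd libz.so | ipp_libs_in_ldd`, and the second called as `is_larger_than_kb libz.so 700` after a static build.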

Using Static libz.a Library to Build User Applications

We provide a simple test case, zpipe.c, for measuring the performance of data compression and decompression (deflate/inflate). To compile the source file with Intel IPP enabled, use the following commands.

For static linking to Intel IPP on Linux* OS:

gcc -O3 -o zpipe_ipp.out zpipe.c -I$IPPROOT/include $HOME/zlib-1.2.8/libz.a $IPPROOT/lib/intel64/libippdc.a $IPPROOT/lib/intel64/libipps.a $IPPROOT/lib/intel64/libippcore.a

For dynamic linking to Intel IPP on Linux* OS:

gcc -O3 -o zpipe_ipp.out zpipe.c -I$IPPROOT/include $HOME/zlib-1.2.8/libz.a -L$IPPROOT/lib/intel64 -lippdc -lipps -lippcore

To compare performance with and without Intel IPP optimization, you can also build the test against the original zlib library:

gcc -O3 -o zpipe.out zpipe.c -I/usr/local/zlib/include /usr/local/zlib/lib/libz.a

In this article, files from the Canterbury Corpus are used for performance testing. We selected several file types to test how Intel IPP improves performance.

To compress an input file, run the following command; the elapsed time for deflate is printed:

./zpipe_ipp.out < input > output

To decompress, run the following command; the elapsed time for inflate is printed:

./zpipe_ipp.out -d < input > output
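Before comparing timings, it is worth confirming that the Intel IPP build still round-trips data correctly. The sketch below compresses a file, decompresses the result, and compares it with the original. It assumes the program behaves like zlib's zpipe example, reading standard input and writing standard output, with -d selecting decompression; roundtrip_ok is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# roundtrip_ok ZPIPE FILE: compress FILE with ZPIPE, decompress the result
# with the same program, and compare the output with the original file.
# Returns 0 when the round trip reproduces FILE exactly.
roundtrip_ok() {
    zpipe=$1
    input=$2
    work=$(mktemp -d) || return 1
    status=0
    "$zpipe" < "$input" > "$work/packed" &&
        "$zpipe" -d < "$work/packed" > "$work/unpacked" &&
        cmp -s "$input" "$work/unpacked" || status=1
    rm -rf "$work"
    return "$status"
}

# Usage (paths are examples):
#   roundtrip_ok ./zpipe_ipp.out plrabn12.txt && echo "round trip OK"
```

Running the same check across the two builds, compressing with zpipe_ipp.out and decompressing with zpipe.out, would additionally show that the streams are interchangeable; that variation is left out of the sketch.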

The performance comparison is listed below. In this test, we measured the average elapsed time of deflate over a loop of 100 runs. Compression ratios are also compared in the following table.

System Info: Intel(R) Xeon(R) CPU E5-2699 v3, Linux 3.10.0-327.el7.x86_64
Testing info: single thread, Zlib-1.2.8, Intel IPP 2017 Update 2

File                  Processing  Zlib Perf.  IPP Perf.  Zlib Ratio  IPP Ratio
plrabn12.txt (461KB)  Deflate     31.72ms     16.58ms    2.43:1      2.45:1
ptt5 (502KB)          Deflate     10.31ms     4.27ms     9.09:1      9.31:1
kennedy.xls (1006KB)  Deflate     32.95ms     18.54ms    5.05:1      4.89:1
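The timings are easier to compare as speedup factors, that is, the zlib time divided by the Intel IPP time. The sketch below recomputes them from the deflate figures in the table; speedups is a hypothetical helper name and the data lines are copied from the table, not newly measured.

```shell
#!/usr/bin/env bash
# Deflate timings copied from the table above: file, zlib ms, IPP ms.
timings='plrabn12.txt 31.72 16.58
ptt5 10.31 4.27
kennedy.xls 32.95 18.54'

# speedups: read "name zlib_ms ipp_ms" lines on stdin and print the name
# followed by zlib_ms / ipp_ms, rounded to two decimals.
speedups() {
    awk '{ printf "%s %.2fx\n", $1, $2 / $3 }'
}

printf '%s\n' "$timings" | speedups
# prints:
# plrabn12.txt 1.91x
# ptt5 2.41x
# kennedy.xls 1.78x
```

On these three files the Intel IPP build is therefore roughly 1.8 to 2.4 times faster at deflate, while the compression ratios stay close to those of the original zlib.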

 

To learn more about Intel IPP data compression, refer to:

Intel® Integrated Performance Primitives (Intel® IPP) 2017 Release Notes
Intel® IPP ZLIB Coding Functions
 

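Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804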