Building a faster LZ4 with Intel® Integrated Performance Primitives

By Zhen Zhao, Sergey Khlystov

Published: 07/13/2017   Last Updated: 07/12/2017

Starting with the 2018 gold release, Intel® Integrated Performance Primitives (Intel® IPP) provides optimizations for LZ4, an open source lossless data compression library. The data compression domain of the Intel® IPP library contains several functions that can speed up both LZ4 compression and decompression. This article describes how to build LZ4 with the Intel® IPP patch applied. A simple test case is also provided to measure the performance of the LZ4 compression and decompression functions when they use Intel® IPP.

Intel® IPP is a component of several Intel software products. Users can get Intel® IPP 2018 from Intel® Performance Libraries, Intel® Parallel Studio XE and the Intel® System Studio 2018 gold release.

Build LZ4 with Intel® IPP on Linux*/macOS*

Intel® IPP provides a patch file (located in $IPPROOT/components/components/interfaces/ipp_lz4) that adds drop-in Intel® IPP optimizations to the lz4 source code. Before building the library, download the lz4-1.7.5.tar.gz file from the LZ4 site (https://github.com/lz4/lz4/releases/tag/v1.7.5/) and extract the archive into a working directory, for instance $HOME/IPP_DC.

tar -xvzf lz4-1.7.5.tar.gz

Patching Source Code

First, set up the Intel® IPP environment with the following command:

source /opt/intel/compilers_and_libraries_<version.update.build>/linux/ipp/bin/ippvars.sh <ia32|intel64>

Patch the source code using the following commands (it is assumed that lz4-1.7.5.patch file is in the working directory):

$ cd lz4-1.7.5
$ patch -p1 < ../lz4-1.7.5.patch
patching file lib/lz4.c
patching file lib/Makefile
patching file programs/Makefile
patching file tests/Makefile
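
When the patch step is scripted, it can help to confirm that every expected file was reported. The following is a minimal sketch in Python, assuming the "patching file <path>" output format shown in the listing above; the helper names patched_files and missing_files are hypothetical and not part of LZ4 or Intel® IPP. It only compares file names and does not detect failed hunks.

```python
def patched_files(patch_output):
    """Return the file paths named in 'patching file <path>' lines."""
    prefix = "patching file "
    files = []
    for line in patch_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            files.append(line[len(prefix):])
    return files


def missing_files(patch_output, expected):
    """Return the expected paths that patch did not report, in the given order."""
    seen = set(patched_files(patch_output))
    return [path for path in expected if path not in seen]


# Output copied from the listing above.
output = """\
patching file lib/lz4.c
patching file lib/Makefile
patching file programs/Makefile
patching file tests/Makefile
"""
expected = ["lib/lz4.c", "lib/Makefile", "programs/Makefile", "tests/Makefile"]
print(missing_files(output, expected))  # prints []
```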

Building LZ4 Library with Intel® IPP

After the LZ4 source code files are patched successfully, set up the build environment for the "make" procedure. From the lz4-1.7.5 folder, before building the lz4 libraries, you need to:

  • Add the WITH_IPP definition to the compiler command line;
  • Add the $IPPROOT/include directory to the compiler's header search path.

No linker settings are required. The patch modifies the makefiles for the library, the LZ4 utilities and the tests to include the Intel® IPP library files, because the LZ4 makefiles are not based on the LDFLAGS environment variable.

Build the library with the standard make utility after setting CFLAGS as follows:

$ export CFLAGS="-O3 -DWITH_IPP -I$IPPROOT/include"
$ make 
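
If IPPROOT is not set, the shell expands -I$IPPROOT/include to -I/include without any warning, and the problem only surfaces later in the build. The following Python sketch assembles the same CFLAGS value and refuses to continue when IPPROOT is empty. The function name ipp_cflags is hypothetical, and the flags are exactly the ones from the export command above.

```python
import os


def ipp_cflags(ipproot, opt_level="-O3"):
    """Return the CFLAGS string used in this article for an IPP-enabled build."""
    if not ipproot:
        # An empty IPPROOT would silently produce "-I/include".
        raise ValueError("IPPROOT is not set; source ippvars.sh first")
    return f"{opt_level} -DWITH_IPP -I{ipproot}/include"


if __name__ == "__main__":
    root = os.environ.get("IPPROOT", "")
    try:
        print(ipp_cflags(root))
    except ValueError as err:
        print(f"error: {err}")
```

The printed string can be exported as CFLAGS before running make.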

Build LZ4 with Intel® IPP on Windows*

For Windows*, download the LZ4 source (.zip) from the same site, https://github.com/lz4/lz4/releases/tag/v1.7.5/. The patching and building processes on Windows* differ from those on Linux*/macOS*. You can use an unzip tool such as 7-Zip to extract the zip file.

Patching Source Code

There are no standard system tools for patching on the Windows* platform, so you need a third-party tool, for example the GnuWin32 port of the GNU patch utility. Open a Command Prompt window, change to the lz4-1.7.5 directory, and run the GnuWin32 patch tool with the following command line:

> "C:\Program Files (x86)\GnuWin32\bin\patch.exe" -p1 --binary < <path-of-patch-file>/lz4-1.7.5.patch
patching file examples/Makefile
patching file lib/lz4.c
patching file lib/Makefile
patching file programs/Makefile
patching file tests/fuzzer.c
patching file tests/Makefile
patching file visual/VS2010/datagen/datagen.vcxproj
patching file visual/VS2010/frametest/frametest.vcxproj
patching file visual/VS2010/fullbench/fullbench.vcxproj
patching file visual/VS2010/fullbench-dll/fullbench-dll.vcxproj
patching file visual/VS2010/fuzzer/fuzzer.vcxproj
patching file visual/VS2010/liblz4/liblz4.vcxproj
patching file visual/VS2010/liblz4-dll/liblz4-dll.vcxproj
patching file visual/VS2010/lz4/lz4.vcxproj
patching file visual/VS2010/lz4.sln

*Please note: the "--binary" option is important here. It avoids problems caused by the difference between Linux* and Windows* end-of-line characters.
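
If the patch still fails to apply on Windows*, mixed line endings are one thing worth checking. The following Python sketch counts CRLF and bare LF line endings in a byte string; the function name count_line_endings is hypothetical and is meant only as a diagnostic aid.

```python
def count_line_endings(data: bytes):
    """Count Windows-style (CRLF) and Unix-style (bare LF) line endings."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract those from the total.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


sample = b"line one\r\nline two\nline three\r\n"
print(count_line_endings(sample))  # prints {'crlf': 2, 'lf': 1}
```

To inspect a real file, read it in binary mode, for example open("lz4-1.7.5.patch", "rb").read(), and pass the result to the function. A file that reports both kinds of line endings has probably been converted by an editor or an archive tool.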

Building LZ4 With Intel® IPP

To build the LZ4 library and tools on Windows*, use the Microsoft* Visual Studio* solution file and project files in the lz4-1.7.5\visual\VS2010 directory.

The patching procedure adds the Intel® IPP-specific configurations Debug_IPP and Release_IPP to both the Win32 and x64 platforms.

These configurations rely on the IPPROOT environment variable, which is set during Intel® IPP environment preparation. You can start Microsoft* Visual Studio* with IPPROOT properly set using the following steps:

  • Open the "Compiler 18.0" command window for the required architecture (IA-32 or Intel® 64) from the "Intel Parallel Studio XE 2018" menu;
  • Change to the LZ4 solution directory with the "cd" command;
  • Start the Microsoft* Visual Studio* IDE by entering "lz4.sln" in the command window. Windows* Explorer will launch the installed Visual Studio* application for you.

Testing LZ4 With Intel® IPP

To test the performance of Intel® IPP-enabled LZ4 compression, use the following command:

./lz4 [arg] [input] [output]

A plain lz4 command runs the system copy from /usr/bin/, so call the locally built LZ4 as ./lz4. For instance, we test with the COPYING file from the lz4-1.7.5/tests directory (it is part of the standard LZ4 "make test" testing) using fast compression in benchmark mode.

> ./lz4 -b tests/COPYING
  1#COPYING           :     18092 ->     10588 (1.709), 603.1 MB/s ,3015.3 MB/s

*Configuration Info – Version: Intel® Integrated Performance Primitives 2018 Beta Update 1. Hardware: CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz, four-core CPU. Operating System: Linux version 4.8.0-54-generic (buildd@lgw01-05) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12)).

As we can see, the compression speed is 603.1 MB/s and the decompression speed is 3015.3 MB/s for this file. To compare this with the performance of the original LZ4 compression/decompression, change the value of the CFLAGS environment variable and have the make utility unconditionally rebuild all targets:

> export CFLAGS="-O3"

> make --always-make

All LZ4 targets will be rebuilt without Intel® IPP symbols and libraries, using only the level-3 compiler optimization option. To test the performance of the original LZ4 data compression/decompression:

> ./lz4 -b tests/COPYING

 1#COPYING           :     18092 ->     10582 (1.710), 441.3 MB/s ,2584.6 MB/s

*Configuration Info – Version: Intel® Integrated Performance Primitives 2018 Beta Update 1. Hardware: CPU: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz, four-core CPU. Operating System: Linux version 4.8.0-54-generic (buildd@lgw01-05) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12)).
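
The two benchmark lines can be compared directly. The following Python sketch parses the lines exactly as printed above and computes the speed ratios; the regular expression and the helper name parse_bench are assumptions based on this output format only. The figures come from a single run on one 18 KB file, so they are indicative rather than a general measurement.

```python
import re

# Matches lz4 benchmark lines such as:
#   1#COPYING           :     18092 ->     10588 (1.709), 603.1 MB/s ,3015.3 MB/s
BENCH_RE = re.compile(
    r"#(?P<name>\S+)\s*:\s*(?P<orig>\d+)\s*->\s*(?P<comp>\d+)\s*"
    r"\((?P<ratio>[\d.]+)\),\s*(?P<cspeed>[\d.]+)\s*MB/s\s*,\s*(?P<dspeed>[\d.]+)\s*MB/s"
)


def parse_bench(line):
    """Extract sizes (bytes) and speeds (MB/s) from one benchmark line."""
    m = BENCH_RE.search(line)
    if m is None:
        raise ValueError(f"not a benchmark line: {line!r}")
    return {
        "name": m["name"],
        "orig": int(m["orig"]),
        "comp": int(m["comp"]),
        "cspeed": float(m["cspeed"]),
        "dspeed": float(m["dspeed"]),
    }


# Lines copied from the two runs above.
ipp = parse_bench("  1#COPYING           :     18092 ->     10588 (1.709), 603.1 MB/s ,3015.3 MB/s")
ref = parse_bench(" 1#COPYING           :     18092 ->     10582 (1.710), 441.3 MB/s ,2584.6 MB/s")

comp_speedup = ipp["cspeed"] / ref["cspeed"]
decomp_speedup = ipp["dspeed"] / ref["dspeed"]
print(f"compression speedup:   {comp_speedup:.2f}x")    # prints 1.37x
print(f"decompression speedup: {decomp_speedup:.2f}x")  # prints 1.17x
print(f"compressed size difference: {ipp['comp'] - ref['comp']} bytes")  # prints 6 bytes
```

For this file, the Intel® IPP build compresses about 1.37 times faster and decompresses about 1.17 times faster than the original build, while the compressed output is 6 bytes larger (10588 versus 10582 bytes).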

LZ4 Basic Functions Modified with Intel® IPP Function Calls

The current LZ4 source code patch file modifies the following set of LZ4 functions:

LZ4 Function                            Additional Information
LZ4_compress_fast_extState              Intel® IPP is used if the acceleration argument is equal to 0 or 1
LZ4_compress_destSize_extState
LZ4_compress_forceExtDict
LZ4_compress_fast                       Calls LZ4_compress_fast_extState
LZ4_compress_destSize                   Calls LZ4_compress_destSize_extState
LZ4_compress_limitedOutput_withState    Calls LZ4_compress_fast_extState. Obsolete function.
LZ4_compress_withState                  Calls LZ4_compress_fast_extState. Obsolete function.
LZ4F_compressFrame                      Intel® IPP function is used for the first frame
LZ4_decompress_safe
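
The relationships in the table can be restated as a small lookup. The following Python sketch is only a model of the table above, not code from the patch or from LZ4: the dictionary CALLS and the helpers resolve and fast_path_uses_ipp are hypothetical names, and the acceleration rule is the one stated for LZ4_compress_fast_extState.

```python
# Wrapper function -> function it calls, as listed in the table above.
CALLS = {
    "LZ4_compress_fast": "LZ4_compress_fast_extState",
    "LZ4_compress_destSize": "LZ4_compress_destSize_extState",
    "LZ4_compress_limitedOutput_withState": "LZ4_compress_fast_extState",
    "LZ4_compress_withState": "LZ4_compress_fast_extState",
}


def resolve(name):
    """Follow the 'Calls ...' entries until a function with no entry is reached."""
    while name in CALLS:
        name = CALLS[name]
    return name


def fast_path_uses_ipp(acceleration):
    """Rule stated for LZ4_compress_fast_extState: IPP only for acceleration 0 or 1."""
    return acceleration in (0, 1)


print(resolve("LZ4_compress_fast"))  # prints LZ4_compress_fast_extState
print(fast_path_uses_ipp(1))         # prints True
print(fast_path_uses_ipp(4))         # prints False
```

Read this way, a call that reaches LZ4_compress_fast_extState with a higher acceleration value would be expected to run the original LZ4 code rather than the Intel® IPP function.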

To learn more about the LZ4 algorithm and benchmarks, see:

LZ4 - Extremely fast compression

lzbench for all compression

 

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804