The Intel Compiler can automatically optimize a native application for the Intel® Atom™ platform, at no additional cost. This document introduces the Intel Compiler (ICC) for readers who want to see the potential advantages of using it. The following topics are covered in this short paper:
- Recommended Optimization Flags for a MID Application
- Linking to ICC Libraries
- Changing a Makefile to Build with ICC
- Speedup for using ICC
For additional information, refer to the ICC documentation, which is included with the installation.
To download the Intel C++ Software Development Tool Suite for Linux OS Supporting Mobile Internet Devices, visit the Intel® Software Development Products Home.
Recommended Optimization Flags for a MID Application
Passing optimization flags to the compiler can improve the speed at which an application runs. When compiling with GCC, the recommended flags are:
- -O2 or -O1: the -O2 flag optimizes for speed, while the -O1 flag optimizes for size
- -msse3
- -march=core2
- -mfpmath=sse
The Intel Compiler (ICC 10.0 and later) can optimize specifically for the Intel® Atom™ (MID) platform. When using ICC, the recommended switches for a MID application are:
- -O2 or -O1: the -O2 flag optimizes for speed, while the -O1 flag optimizes for size
- -xSSE3_ATOM: enables optimizations specific to the Atom platform. This flag only works with ICC version 11.0 or later. With version 10.0, the -xL flag produces a similar result
- -ipo: enables interprocedural optimization across source files
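Because the Atom-specific switch depends on the ICC version, a build script may want to choose it automatically. The following is a minimal sketch of that choice; the function name `icc_atom_flags` and the file names `app` and `app.c` are hypothetical, and the major version is assumed to be supplied by the caller.

```shell
# icc_atom_flags MAJOR_VERSION
# Prints the suggested ICC switches for the given ICC major version.
# (Hypothetical helper; the flag choices follow the list above.)
icc_atom_flags() {
    major=$1
    if [ "$major" -ge 11 ]; then
        # -xSSE3_ATOM is only available in ICC 11.0 and later.
        echo "-O2 -xSSE3_ATOM -ipo"
    elif [ "$major" -eq 10 ]; then
        # ICC 10.0 uses -xL for a similar result.
        echo "-O2 -xL -ipo"
    else
        echo "unsupported ICC major version: $major" >&2
        return 1
    fi
}

# Example: show the compile command for ICC 11 (app.c is a placeholder).
ICC_FLAGS=$(icc_atom_flags 11)
echo "icc $ICC_FLAGS -o app app.c"
```

Replace -O2 with -O1 in the helper if code size matters more than speed.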
To use the ICC compiler, make sure the appropriate environment variables are set on your build machine. First locate the directory that contains the ICC installation, then add the corresponding entries to your .bashrc file (in your home directory). For example, adding the following lines to ~/.bashrc adds ICC (version 11.0) to your bash environment:
export PATH=$PATH:/opt/intel/Compiler/11.0/074/bin/ia32
export LD_LIBRARY_PATH=/opt/intel/Compiler/11.0/074/lib/ia32
export MANPATH=$MANPATH:/opt/intel/Compiler/11.0/074/man
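If the same .bashrc is shared between machines that may not have ICC installed, the exports can be guarded. The sketch below is one way to do that, under the assumption that the install lives at the example path above; the function name `icc_env_setup` is hypothetical. Unlike the lines above, it appends to LD_LIBRARY_PATH rather than replacing it.

```shell
# icc_env_setup ICC_ROOT
# Adds ICC to PATH, LD_LIBRARY_PATH and MANPATH, but only if the
# install directory actually exists. (Hypothetical helper.)
icc_env_setup() {
    root=$1
    if [ ! -d "$root/bin/ia32" ]; then
        echo "ICC not found under $root; environment left unchanged" >&2
        return 1
    fi
    PATH="$PATH:$root/bin/ia32"
    # Append to any existing value; avoid a leading ':' when it is unset.
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$root/lib/ia32"
    MANPATH="$MANPATH:$root/man"
    export PATH LD_LIBRARY_PATH MANPATH
}

# Path taken from the example above; adjust to your installation.
icc_env_setup /opt/intel/Compiler/11.0/074 || true
```

After opening a new shell, running `which icc` is a quick way to confirm that the compiler is on the PATH.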
Linking to ICC Libraries
By default, the linker links libraries dynamically, which keeps the executable small. In that case, you need to ship the ICC shared libraries (.so files) along with the executable on every device that runs an ICC-optimized application.
Another option is to statically link the application to the ICC libraries, so that the ICC library code is included in the executable itself. Passing the “-static-intel” flag to the linker statically links only the ICC libraries into the executable. The “-static” flag statically links ALL libraries into the executable.
NOTE: Linker Flag -static-intel
This option causes Intel-provided libraries to be linked in statically.
Linker Flag -static
This option links all libraries statically into the application.
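To decide which .so files would have to ship with a dynamically linked build, it can help to list the Intel runtime libraries a binary still depends on. The sketch below filters `ldd` output for them; the function name `intel_runtime_deps` is hypothetical, and the library names in the pattern are an assumption that may need adjusting for your ICC version.

```shell
# intel_runtime_deps
# Reads ldd-style output on stdin and keeps only lines that mention
# Intel runtime libraries. The name list is an assumption, not an
# exhaustive inventory of what ICC may link against.
intel_runtime_deps() {
    grep -E 'lib(imf|svml|intlc|iomp5)\.so' || true
}

# Usage on a real binary (the file name "gogo" is a placeholder):
#   ldd ./gogo | intel_runtime_deps
```

A build linked with -static-intel should produce no matching lines, while for a fully static build `ldd` itself typically reports that the file is not a dynamic executable.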
Changing a Makefile to Build with ICC
Switching from GCC to ICC to build a Linux application generally requires a change to the application’s Makefile. To illustrate this, part of the original Makefile for an open source MP3 encoder called Gogo is shown below:
CC = gcc -c -I../engine
AS = nasm -i../engine/i386/
LD = gcc
MAKECFG = makecfg
LIBS = -lm
LDFLAGS = $(PROF)
CFLAGS = -Wall $(PROF) -O1 -msse -mtune=pentium4 -march=pentium4 -pedantic -pipe -fstrength-reduce -fexpensive-optimizations -finline-functions -funroll-loops -foptimize-register-move -DNDEBUG
AFLAGS = -f elf -D__unix__ $(E3DN)
An updated Makefile that uses the ICC compiler might look like the example below. The changes are on the CC, LD, LDFLAGS, and CFLAGS lines:
CC = icc -c -I../engine
AS = nasm -i../engine/i386/
LD = icc
MAKECFG = makecfg
LIBS = -lm
LDFLAGS = $(PROF) -static
CFLAGS = -Wall $(PROF) -O1 -xSSE3_ATOM -ipo -no-multibyte-chars -pedantic -pipe -fstrength-reduce -fexpensive-optimizations -finline-functions -funroll-loops -foptimize-register-move -DNDEBUG
AFLAGS = -f elf -D__unix__ $(E3DN)
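The same edits could also be applied with a small script instead of by hand. The following sketch rewrites the four affected lines with `sed`; the function name `gcc_to_icc_makefile` and the output file name are hypothetical, and the patterns assume the variable layout of the Gogo Makefile shown above.

```shell
# gcc_to_icc_makefile
# Reads a Makefile on stdin and writes an ICC variant to stdout.
# Assumes the CC/LD/LDFLAGS/CFLAGS layout of the Gogo example.
gcc_to_icc_makefile() {
    sed -e 's/^\(CC[[:space:]]*=[[:space:]]*\)gcc/\1icc/' \
        -e 's/^\(LD[[:space:]]*=[[:space:]]*\)gcc/\1icc/' \
        -e 's/^\(LDFLAGS[[:space:]]*=.*\)$/\1 -static/' \
        -e '/^CFLAGS/s/-msse -mtune=pentium4 -march=pentium4/-xSSE3_ATOM -ipo -no-multibyte-chars/'
}

# Usage (Makefile.icc is a placeholder name):
#   gcc_to_icc_makefile < Makefile > Makefile.icc
#   make -f Makefile.icc
```

Writing to a separate file keeps the original GCC Makefile intact, which makes it easy to build with both compilers and compare the results.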
Speedup for using ICC
Actual results will vary from application to application, but the data below shows the speedup gained by using ICC on the Gogo MP3 encoder. The data shows how long it took to encode a single WAV file to MP3 format. The tests were run on a 1.33 GHz Intel® Atom™ processor, both with and without Intel’s Hyper-Threading (HT) technology.
The compiler flags used for each build configuration are shown below:
GCC-no-opt: no compiler optimizations
-Wall
GCC-orig: the original compiler flags in the Gogo Makefile
-Wall -O1 -msse -mtune=pentium4 -march=pentium4 -pedantic -pipe -fstrength-reduce -fexpensive-optimizations -finline-functions -funroll-loops -foptimize-register-move
GCC-optimized: adding the suggested GCC compiler flags for Intel® Atom™
-Wall -O1 -msse3 -march=core2 -mfpmath=sse -pedantic -pipe -fstrength-reduce -fexpensive-optimizations -finline-functions -funroll-loops -foptimize-register-move
ICC: adding the suggested ICC compiler flags for Intel® Atom™
-Wall -O1 -xSSE3_ATOM -ipo -no-multibyte-chars -pedantic -pipe -fstrength-reduce -fexpensive-optimizations -finline-functions -funroll-loops -foptimize-register-move
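To compare configurations like these on your own hardware, the speedup can be expressed as the baseline encode time divided by the new encode time. The helper below is a small sketch of that calculation; the name `speedup` is hypothetical, and the numbers in the usage line are placeholders, not measurements from this article.

```shell
# speedup BASELINE_SECONDS NEW_SECONDS
# Prints baseline/new with two decimals; values above 1.00 mean the
# new build is faster. (Hypothetical helper.)
speedup() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        if (new <= 0) {
            print "new time must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.2f\n", base / new
    }'
}

# Placeholder timings in seconds (not results from the tests above).
speedup 10.0 8.0
```

The timings themselves could be collected with the shell's `time` keyword around each encoder run, ideally averaged over several runs.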
