This technical note provides a quick overview of some of the key ways in which the Intel® C++ Compiler for Linux*, as included in the Intel® Application Software Development Tool Suite for Intel® Atom™ Processor and the Intel® Embedded Software Development Tool Suite for Intel® Atom™ Processor, can be used to help improve the application execution flow you achieve on an Intel® Atom™ Processor based platform. The option settings and techniques described below apply equally whether you employ them on a Microsoft Windows* or Linux* based software stack. They can therefore also be used with the Intel® C++ Compiler for Windows*, version 10.1 or 11.x.
The Intel® C++ Compiler is an optimizing compiler for Intel® architecture and compatible processor technologies. It can be installed into an existing Microsoft Visual Studio* build environment or into an existing GNU* GCC installation and used to help you get an additional performance boost for your applications.
The Intel® Atom™ Processor is a new generation of low-power IA-32 based Intel® processors. Its unique design makes it advisable to optimize your applications specifically for the Intel® Atom™ Processor. This will enable you to take advantage of the power savings and the execution performance of this microarchitecture.
The most obvious difference from other Intel® processors is that the Intel® Atom™ Processor has an in-order instruction scheduler. This means that instructions are fed into the instruction pipeline in exactly the order in which they appear in the binary code of your application. No instruction reordering is done at the processor level. As a result, the processor is considerably more sensitive to instruction latencies and dependency stalls caused by poor instruction scheduling. Furthermore, you may want your compiler to be more conservative when it comes to picking specific microcode instructions or atomic instructions, depending on the memory access latency or risk of dependency stalls a specific instruction brings with it.
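To make the idea of a dependency stall more concrete, the following minimal sketch contrasts two ways of summing an array in C. It is an illustration only and is not taken from this note: the function names `sum_single_chain` and `sum_two_chains` are hypothetical, and whether the second form is actually faster depends on the compiler, the options used, and the target processor. In the first function every addition has to wait for the result of the previous one; in the second, two independent accumulators give an in-order pipeline neighbouring instructions that do not depend on each other.

```c
#include <stddef.h>

/* Hypothetical helper: each addition depends on the previous value of
 * `sum`, so the loop forms one long dependency chain. */
long sum_single_chain(const int *values, size_t count)
{
    long sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += values[i];
    }
    return sum;
}

/* Hypothetical helper: two accumulators form two independent chains,
 * so consecutive additions do not have to wait for each other. */
long sum_two_chains(const int *values, size_t count)
{
    long even_sum = 0;
    long odd_sum = 0;
    size_t i = 0;

    /* Process the array two elements at a time. */
    for (; i + 1 < count; i += 2) {
        even_sum += values[i];
        odd_sum += values[i + 1];
    }
    /* Pick up the last element when count is odd. */
    if (i < count) {
        even_sum += values[i];
    }
    return even_sum + odd_sum;
}
```

Both functions return the same result for the same input. An optimizing compiler may well perform a comparable transformation on its own, which is one reason to let the compiler schedule instructions for the specific target rather than restructuring source code by hand.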
Lastly, the simplified math instruction handling on the Intel® Atom™ processor makes for additional power savings, but it also means that the compiler has to pay extra attention to the specific code it generates so as not to impact the execution speed of your code.
In this technical note we will briefly go over the various optimizations employed by the Intel® C++ Compiler to target the Intel® Atom™ processor. Beyond that, we will also briefly introduce the principles of interprocedural optimization and profile-guided optimization, both of which can be very useful for optimizing an application targeting the Intel® Atom™ Processor.
Intel® Atom™ Processor Specific Optimization
The compiler optimizations specifically targeting the Intel® Atom™ Processor can be grouped into three categories: those related to the in-order instruction scheduler, which aim to minimize dependency stalls caused by instruction latencies; those taking advantage of new or preferable instructions added to the instruction set; and lastly those that take advantage of some of the advanced features, like SSE3 instructions and bi-endianness support, that the Intel® Atom™ Processor shares with some other Intel® processors. Taking advantage of these features is triggered by using the
-xL (Linux*) or /QxL (Windows*)
and with the Intel® C++ Compiler 11.x also the
-xSSE3_ATOM (Linux*) or /QxSSE3_ATOM (Windows*)