Intel® C++ Compiler for Mac OS X* Knowledge Base

Title Modified Date
Being Successful with the Intel® Compilers -- You Need to Know
Tips and techniques on using the Intel Compilers to maximize your application performance.
Author: Brandon Hewitt (Intel), Vangal Venkatesh
Type: Performance and Optimization
11/02/2009
Performance Tools for Software Developers - Loop blocking
Loop blocking combines strip mining and loop interchange to enhance reuse of local data. It helps nested loops that manipulate arrays too large to fit into the cache. Loop blocking allows reuse of the arrays by transforming the
Type: Performance and Optimization
07/13/2009
Performance Tools for Software Developers - Auto parallelization and /Qpar-threshold
The article describes the effect of the /Qpar-threshold option when doing auto-parallelization with the Intel C++ compiler.
Author: Om Sachan (Intel)
Type: Performance and Optimization
07/13/2009
Intel® compiler options for SSE generation (SSE2, SSE3, SSSE3, SSE4) and processor-specific optimizations
Explains which Intel Compiler switches to use to target and optimize for a specific platform, CPU, or processor.
Type: Performance and Optimization
07/13/2009
An Overview of the Parallelization Implementation Methods in Intel® C++ Compilers
Describes the various ways you can use Intel® C++ Compilers to enable your applications for multi-core and many-core systems.
Author: Mark Sabahi (Intel)
Type: Performance and Optimization
06/19/2009
IA-32 and Intel®64 Processor Targeting Overview
The compiler supports many options that tune or optimize an application for different Intel and non-Intel processors. Differences are explained, and the switches /arch, /Qx..., /Qax... (Windows*) and -m, -x..., -ax... (Linux*, Mac OS* X) are recommended.
Author: Martyn Corden (Intel)
Type: Performance and Optimization
04/06/2009
OpenMP* Loops with Function Calls for Bounds May Not Parallelize
Loops with function calls as bounds (such as STL end() calls) may not be compiled into parallel code even when OpenMP APIs are used explicitly.
Author: Brandon Hewitt (Intel)
Type: Performance and Optimization
03/12/2009
Intel® Fortran Compiler - Automatic CPU Dispatch For Multiple CPU Types
In 11.0, a new scheme for the processor targeting switches is introduced for ease of use and recall. The new options /QaxSSE4.1, /QaxSSSE3 and /QaxSSE3 are equivalent to /QaxS, /QaxT and / ...
Type: Performance and Optimization
03/01/2009
Disable movbe to Test Intel® Atom™ Processor Targeted Code on non-Intel® Atom™ Processor Platforms
For those looking to validate code targeted for the Intel® Atom™ processor on other Intel Architectures, a mechanism to disable generation of the movbe instruction is provided.
Author: Brandon Hewitt (Intel)
Type: Performance and Optimization
02/20/2009
Requirements for Vectorizable Loops
Vectorization is one of many optimizations that are enabled by default in the latest Intel compilers. In order to be vectorized, loops must obey certain conditions, listed below. Some additional ways to help the compiler to vectorize loops are described.
Author: Martyn Corden (Intel)
Type: Performance and Optimization
01/29/2009
HPCC-stream performance loss with the 11.0 compiler
The STREAM component of the HPCC benchmark suite runs more slowly with the initial release of the 11.0 compiler compared to version 10.1. This can be worked around by disabling function inlining.
Author: Martyn Corden (Intel)
Type: Performance and Optimization
12/19/2008
Compiler 11.0 OpenMP programs exhibit high core usage
Compiler 11.0 C++ and Fortran OpenMP-enabled programs on Windows, Linux, and Mac OS X have high CPU utilization for all cores in a multi-core system, even if it is known that not all cores are being fully utilized. Setting KMP_BLOCKTIME has no effect.
Author: pbkenned
Type: Performance and Optimization
11/20/2008
High Clocks Per Instruction Retired when vectorizing the loop.
Sometimes when we vectorize a loop, we get a high Clocks Per Instruction Retired (CPI) value. This happens when there is high bus utilization and the bus gets saturated.
Type: Performance and Optimization
11/18/2008
Performance Tools for Software Developers - SSE generation and processor-specific optimizations continue
Can I combine the processor values and target more than one processor? How do I generate optimized code for both Intel and AMD* architectures? Where can I find more information on processor-specific optimizations?
Type: Performance and Optimization
11/06/2008
Intel® C++ Compiler - Consistent Use of Compiler Options in Compile/Link Phase
If you are compiling applications with a separate compile and link process, the optimization options in the compile and link phases should match, especially when using OpenMP, parallelization, vecto ...
Type: Performance and Optimization
09/19/2008
Intel® Fortran Compiler - Training courses
Intel offers training courses designed to help software developers become productive and to improve application performance with the Intel® C++ and Intel® Fortran Compilers for Windows*, Lin ...
Type: Performance and Optimization
09/19/2008
Performance Tools for Software Developers - How to generate optimal code for Intel® processors running Mac OS* X?
The Intel® Compilers 10.0 for Mac OS* X running on systems with IA-32 architecture will include the option -xP at default optimization to automatically vectorize code and generate SSE3, SSE2, and SS ...
Type: Performance and Optimization
09/19/2008
Performance Tools for Software Developers - Some Applications Built with -xP or /QxP Optimizations May Produce Runtime Error
Symptom(s): The following message may be displayed when a program built with the switches "-xP" (on Linux*) or "/QxP" (on Windows*) is run on a system with an Intel® Core™ 2 Duo processor or a ...
Type: Performance and Optimization
09/19/2008
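
The loop blocking entry above describes the technique as strip mining combined with loop interchange. A minimal sketch of that idea is given here, assuming a square row-major matrix transpose as the workload. The function names and the default block size of 32 are hypothetical choices for illustration and are not taken from the listed article.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Naive transpose of an n x n row-major matrix: dst[j][i] = src[i][j].
// The writes to dst stride through memory by n elements per inner iteration.
std::vector<double> transpose_naive(const std::vector<double>& src, std::size_t n) {
    assert(src.size() == n * n);
    std::vector<double> dst(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];
    return dst;
}

// Blocked version (hypothetical helper): both loops are strip-mined into
// strips of `block` iterations, and the strip loops are interchanged to the
// outside, so each block x block tile is completed before moving on.
// The block size of 32 is an arbitrary assumption; a good value depends on
// the cache size of the target processor.
std::vector<double> transpose_blocked(const std::vector<double>& src,
                                      std::size_t n,
                                      std::size_t block = 32) {
    assert(src.size() == n * n);
    assert(block > 0);
    std::vector<double> dst(n * n);
    for (std::size_t ii = 0; ii < n; ii += block) {
        const std::size_t i_end = std::min(ii + block, n);  // handle partial tiles
        for (std::size_t jj = 0; jj < n; jj += block) {
            const std::size_t j_end = std::min(jj + block, n);
            for (std::size_t i = ii; i < i_end; ++i)
                for (std::size_t j = jj; j < j_end; ++j)
                    dst[j * n + i] = src[i * n + j];
        }
    }
    return dst;
}
```

Both functions produce the same result; only the order in which elements are visited differs. Whether the blocked form is faster depends on the matrix size, the cache hierarchy, and what the compiler already does at the chosen optimization level, so it is worth measuring before adopting it.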