Compile moblin 2.1 kernel sources with Intel Compiler
Instructions to compile moblin 2.1 kernel sources with the Intel Compiler Author: Maximillian Domeika (Intel) Type: Performance and Optimization |
11/18/2009
|
Being Successful with the Intel® Compilers -- You Need to Know
Tips and techniques on using the Intel Compilers to maximize your application performance. Author: Brandon Hewitt (Intel),Vangal Venkatesh Type: Performance and Optimization |
11/02/2009
|
Putting -lm Before User Objects/Libraries on Link Line Can Impact Performance
Recommended linking model:
icc/icpc/ifort [user objs] [user libs] [sys libs]
Using -lm (the GNU math library) prior to user-created objects or libraries causes the GNU libm to be used instead of the Intel math library, impacting performance. Author: Brandon Hewitt (Intel) Type: Performance and Optimization |
10/14/2009
|
Building the GAMESS with Intel® Compilers, Intel® MKL and Intel® MPI on Linux
Introduction: This document explains how to build GAMESS using the Intel software products: Intel® C++ Compiler for Linux, Intel® Fortran Compiler for Linux, Intel® MKL, Intel® MPI for Linux. Version: GAMES ... Author: Kirill Mavrodiev (Intel) Type: Performance and Optimization |
08/26/2009
|
WPS V3.1.1 installation best known method for Linux with Intel® Fortran Compiler v. 11.1
Introduction: This document explains how to build the WRF Preprocessing System (WPS) v3.1.1 using the Intel® Fortran Compiler for Linux, for example, version 11.1.046. Version: v3.1.1 Obtaining Source Co ... Author: Kirill Mavrodiev (Intel) Type: Performance and Optimization |
08/19/2009
|
WRF V3.1.1 installation best known method for Linux with Intel C++ and Fortran COMPILER v. 11.1
Introduction: This document explains how to build the Weather Research & Forecasting (WRF) v3.1.1 using the Intel C++ and Fortran Compilers for Linux, for example, version 11.1.046. "WRF was devel ... Author: Kirill Mavrodiev (Intel) Type: Performance and Optimization |
08/13/2009
|
How to Compile for Intel® AVX
Use the Intel Compiler 11.1 with the switch /QxAVX (Windows*) or -xavx (Linux*) to compile applications for Intel® Advanced Vector Extensions (Intel® AVX). Author: Martyn Corden (Intel) Type: Performance and Optimization |
07/16/2009
|
Performance Tools for Software Developers - Loop blocking
Loop blocking combines strip mining and loop interchange to enhance reuse of local data. It helps nested loops that manipulate arrays too large to fit into the cache. Loop blocking allows reuse of the arrays by transforming the ... Type: Performance and Optimization |
07/13/2009
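The transformation summarized in the entry above can be illustrated with a minimal sketch. It is not taken from the article: the function name `transpose_blocked` and the default tile size of 32 are assumptions chosen for illustration, and the best tile size depends on the cache of the target processor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article): transpose an n x n
// row-major matrix using loop blocking. `block` is an assumed tile size.
std::vector<double> transpose_blocked(const std::vector<double>& a,
                                      std::size_t n,
                                      std::size_t block = 32) {
    assert(a.size() == n * n);
    assert(block > 0);
    std::vector<double> out(n * n);
    // Strip mining: the i and j loops are split into tile loops (ii, jj)
    // and intra-tile loops (i, j). Loop interchange then moves both tile
    // loops outward, so the inner loops touch only one tile at a time.
    for (std::size_t ii = 0; ii < n; ii += block) {
        for (std::size_t jj = 0; jj < n; jj += block) {
            const std::size_t i_end = std::min(ii + block, n);
            const std::size_t j_end = std::min(jj + block, n);
            for (std::size_t i = ii; i < i_end; ++i) {
                for (std::size_t j = jj; j < j_end; ++j) {
                    out[j * n + i] = a[i * n + j];
                }
            }
        }
    }
    return out;
}
```

Calling `transpose_blocked(a, n)` returns the transposed matrix. The result is the same as an unblocked double loop; only the order in which elements are visited changes.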
|
Performance Tools for Software Developers - Auto parallelization and /Qpar-threshold
The article describes the effect of the /Qpar-threshold option on auto-parallelization with the Intel C++ compiler. Author: Om Sachan (Intel) Type: Performance and Optimization |
07/13/2009
|
Intel® compiler options for SSE generation (SSE2, SSE3, SSSE3, SSE4) and processor-specific optimizations
Explains which Intel Compiler switches to use to target and optimize for a specific platform, CPU, or processor. Type: Performance and Optimization |
07/13/2009
|
An Overview of the Parallelization Implementation Methods in Intel(R) C++ Compilers
Description of the various ways you can use Intel® C++ Compilers to enable your applications for multi-core and many-core. Author: Mark Sabahi (Intel) Type: Performance and Optimization |
06/19/2009
|
IA-32 and Intel®64 Processor Targeting Overview
The compiler supports many options that tune or optimize an application for different Intel and non-Intel processors. Differences are explained, and the switches /arch, /Qx..., /Qax... (Windows*) and -m, -x..., -ax... (Linux*, Mac OS* X) are recommended. Author: Martyn Corden (Intel) Type: Performance and Optimization |
04/06/2009
|
How to check Auto-vectorization
You can use the Intel C++ compiler option /Qvec-report3 (Windows*) or -vec-report3 (Linux*) to generate a vectorization report. The vectorizer reports vectorized and non-vectorized loops and any proven or assumed data dependences. Author: Om Sachan (Intel) Type: Performance and Optimization |
03/24/2009
|
OpenMP* Loops with Function Calls for Bounds May Not Parallelize
Loops with function calls as bounds (such as STL end() calls) may not be parallelized, even if OpenMP is used explicitly. Author: Brandon Hewitt (Intel) Type: Performance and Optimization |
03/12/2009
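One common way to sidestep the issue summarized above is to evaluate the bound once, before the loop, and iterate over a plain integer index. The sketch below is an assumption about how that might look, not the article's own code; `add_into` is a hypothetical name. It compiles with or without OpenMP enabled, since the pragma is ignored when OpenMP is off.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example (not from the article): add src into dst element-wise.
void add_into(std::vector<double>& dst, const std::vector<double>& src) {
    assert(dst.size() == src.size());
    // Hoist the size() call out of the loop so the loop bound is a
    // loop-invariant integer instead of a function call such as end().
    const long n = static_cast<long>(dst.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}
```

Each iteration writes a distinct element, so the iterations are independent and the loop has the canonical form that OpenMP work-sharing expects.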
|
Intel® Fortran Compiler - Automatic CPU Dispatch For Multiple CPU Types
In 11.0, a new scheme for the processor-targeting switches is introduced for ease of use and recall. The new options /QaxSSE4.1, /QaxSSSE3 and /QaxSSE3 are equivalent to /QaxS, /QaxT and / ... Type: Performance and Optimization |
03/01/2009
|
Disable movbe to Test Intel® Atom™ Processor Targeted Code on non-Intel® Atom™ Processor Platforms
For those looking to validate code targeted for the Intel® Atom™ processor on other Intel Architectures, a mechanism to disable generation of the movbe instruction is provided. Author: Brandon Hewitt (Intel) Type: Performance and Optimization |
02/20/2009
|
Requirements for Vectorizable Loops
Vectorization is one of many optimizations that are enabled by default in the latest Intel compilers. In order to be vectorized, loops must obey certain conditions, listed below. Some additional ways to help the compiler to vectorize loops are described. Author: Martyn Corden (Intel) Type: Performance and Optimization |
01/29/2009
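The article's own list of conditions is not reproduced in this summary. As a hedged illustration, the sketch below shows a loop with properties commonly cited as making vectorization possible: a trip count known on entry, a single entry and exit, straight-line code in the body, no function calls, and no dependence between iterations. The name `saxpy` is chosen here for illustration, and whether a given compiler actually vectorizes the loop depends on its options and the target processor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example (not from the article): y = a * x + y.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();   // trip count fixed before the loop starts
    const float* xp = x.data();
    float* yp = y.data();
    // Innermost loop, unit-stride access, no branches or calls in the body,
    // and each iteration reads and writes only its own element.
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

A vectorization report from the compiler is the way to confirm whether the loop was in fact vectorized.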
|
HPCC-stream performance loss with the 11.0 compiler
The STREAM component of the HPCC benchmark suite runs more slowly with the initial release of the 11.0 compiler compared to version 10.1. This can be worked around by disabling function inlining. Author: Martyn Corden (Intel) Type: Performance and Optimization |
12/19/2008
|
Compiler 11.0 OpenMP programs exhibit high core usage
Compiler 11.0 C++ and Fortran OpenMP-enabled programs on Windows, Linux, and Mac OS X have high CPU utilization for all cores in a multi-core system, even if it is known that not all cores are being fully utilized. Setting KMP_BLOCKTIME has no effect. Author: pbkenned Type: Performance and Optimization |
11/20/2008
|
High Clocks Per Instruction Retired when vectorizing the loop.
Sometimes when we vectorize a loop, we get a high Clocks Per Instruction Retired (CPI) value. This happens when there is high bus utilization and the bus gets saturated. Type: Performance and Optimization |
11/18/2008
|
Performance Tools for Software Developers - SSE generation and processor-specific optimizations continue
Can I combine the processor values and target more than one processor?
How to generate optimized code for both Intel and AMD* architectures?
Where can I find more information on processor-specific optimizations? Type: Performance and Optimization |
11/06/2008
|
Ensuring Shared Library Uses Intel Math Functions
When linking shared libraries with the Intel compiler libraries statically, the shared library may call the GNU libm functions instead of the Intel math functions. Author: John O (Intel) Type: Performance and Optimization |
10/13/2008
|
Intel® C++ Compiler for Linux* - OpenMP* specification support
Intel® C++ / Fortran Compiler Version | OpenMP* Standard Version
10.0 | 2.5
9.1 | 2.5
9.0 | 2.0
8.x | 2.0
7.x | 2.0
Note: No new features were added in OpenMP* 2.5, but some clarifications ... Type: Performance and Optimization |
09/19/2008
|
Intel® C++ Compiler - Consistent Use of Compiler Options in Compile/Link Phase
If you are compiling applications with separate compile and link steps, the optimization options used in the compile and link phases should match, especially when using OpenMP, parallelization, vecto ... Type: Performance and Optimization |
09/19/2008
|
Intel® Fortran Compiler - Training courses
Intel offers training courses designed to help software developers become productive and to improve application performance with the Intel® C++ and Intel® Fortran Compilers for Windows*, Lin ... Type: Performance and Optimization |
09/19/2008
|