This article helps developers who are new to the Intel compilers get maximum performance on IA-32, Intel® 64 and Itanium® processor platforms running the Microsoft Windows*, Linux*, and Mac OS* X operating systems.
It answers common questions customers have about the Intel compilers. The following sections discuss how to use the compilers and what to do if you encounter problems, including some troubleshooting techniques. The article also gives a high-level overview of some of the available optimizations, with tips for improving performance.
Does it compile the source?
First, see whether your source code will compile with the appropriate Intel compiler. On the Windows side, Intel® Parallel Composer and the Intel C++ Compiler for Windows* Professional Edition are source and object compatible with the Microsoft Visual C++* compiler, so they compile source code that the Microsoft Visual C++ compiler can compile. The Intel Visual Fortran Compiler for Windows is substantially source compatible with Compaq Visual Fortran* and conforms to the Fortran 95, Fortran 90, and Fortran 77 standards.
The Intel C++ Compilers for Linux and Mac OS X are object compatible with GNU* C and C++. The Intel Fortran Compilers for Linux and Mac OS X also conform to the Fortran 95, Fortran 90, and Fortran 77 standards.
What if the compiler rejects the source?
If the compile fails, check your source code for unsupported language extensions. For example, if you compile a file that uses a GNU gcc language extension that the Intel compiler does not support, the compiler issues a syntax error. Similarly, for Fortran, if you compile code that violates the Fortran 95, Fortran 90, or Fortran 77 standards or contains language extensions the compiler doesn't recognize, the Intel compiler issues a syntax error. The best way to solve this type of problem is to rewrite the source code so that it conforms to the standards and contains no unsupported extensions. Note that the Intel compiler may report errors for non-compliant source code even when other compilers accept it.
Does the program run?
Once the application is built, the next step is typically to run it against a set of tests to ensure its outputs are correct.
What if some tests fail?
If some tests fail, try compiling the files of the application being tested using /Od (Windows) or -O0 (Linux/Mac) to turn off the optimizer. Optimizations are discussed below.
If the tests still fail with the optimizer turned off, there is likely a problem in the source code, although it is also possible that the compiler is generating incorrect code. If you suspect the latter, please report the problem to Intel.
What optimization should I use?
The basic optimization switches of the Intel compilers are:
- /Od on Windows or -O0 on Linux/Mac OS X (no optimizations)
- -O1 (optimize for speed while focusing on code size)
- -O2 (optimize for speed)
- -O3 (optimize for speed and perform aggressive optimizations).
It is recommended that you use -O2 if possible; it is the default for the Intel compilers. At -O3 the compiler performs more aggressive optimizations, so rerun your application tests to confirm that they all still pass at -O3.
How do I target a particular processor?
What about advanced optimizations?
Interprocedural optimizations (-ipo and /Qipo) improve performance within a file and across a multi-file program. The optimizations performed include function inlining, interprocedural constant propagation, dead code elimination, and others.
Profile-guided optimizations (-prof_use and /Qprof_use) can improve the performance of a program by passing run-time information back to the compiler. The compiler can use this information to improve branch prediction and cache utilization and to make better choices of functions to inline.
It is always a good practice to test your application when using aggressive compiler optimizations like Interprocedural Optimizations and Profile Guided Optimizations. Issues with these optimizations are more difficult to debug than those with standard optimizations.
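Profile-guided optimization is usually a three-step cycle: build an instrumented binary, run it on representative input, then rebuild using the collected profile. The sketch below only assembles the three command lines and does not run them. It is a minimal illustration under stated assumptions: the instrumentation switch is assumed to be -prof_gen (the counterpart of the -prof_use spelling used in this article), and the compiler name, file names, and the helper pgo_commands are hypothetical. Check your compiler's documentation for the exact spellings on your version and platform.

```python
def pgo_commands(sources, exe="app", compiler="icc"):
    """Return the three command lines of a profile-guided build.

    Nothing is executed here. The -prof_gen switch and the default
    compiler/executable names are assumptions made for illustration.
    """
    # Step 1: build an instrumented executable that records run-time data.
    instrumented = [compiler, "-prof_gen", *sources, "-o", exe]
    # Step 2: run it on a representative workload to collect the profile.
    training_run = ["./" + exe]
    # Step 3: rebuild, letting the compiler use the collected profile.
    optimized = [compiler, "-prof_use", "-O2", *sources, "-o", exe]
    return [instrumented, training_run, optimized]


if __name__ == "__main__":
    for cmd in pgo_commands(["main.c", "util.c"]):
        print(" ".join(cmd))
```

The quality of the result depends on how representative the training run in step 2 is of real usage.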
My Program Runs Successfully with /Od but Fails with /O2. What Should I Do?
If your program runs successfully with /Od but fails with /O2, the next step is to determine which files cause the problem, so that only those files need to be compiled with /Od. A divide-and-conquer strategy works well here. First, compile half the files (for example, files whose names start with a-m) with /O2 and the rest with /Od. If the program passes the tests, the problem is somewhere in the files starting with n-z, and files a-m can be compiled with /O2. If it fails, at least part of the problem is in files a-m. Repeat the split on the suspect half until the problematic files are isolated. If the failure turns out to be a small numerical difference, see the next section, "Should I worry about precision?"
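The divide-and-conquer search can be automated in a build script. The following is a minimal sketch, not a complete tool. It assumes exactly one file is responsible for the failure, and it relies on a hypothetical callback, passes_with_optimized, that you would write yourself to rebuild the given files with /O2 and all others with /Od, run the tests, and report whether they passed.

```python
def find_bad_file(files, passes_with_optimized):
    """Isolate the single file whose optimization makes the tests fail.

    passes_with_optimized(subset) is a hypothetical user-supplied callback:
    it rebuilds with `subset` optimized and every other file unoptimized,
    runs the tests, and returns True if they pass.
    Assumes the tests fail when everything is optimized and that exactly
    one file is responsible.
    """
    candidates = list(files)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if passes_with_optimized(half):
            # This half is safe to optimize; the culprit is in the rest.
            candidates = candidates[len(half):]
        else:
            # Optimizing this half breaks the tests; keep searching in it.
            candidates = half
    return candidates[0]


if __name__ == "__main__":
    # Stand-in for a real build-and-test run: pretend n.c is the culprit.
    def fake_build_and_test(subset):
        return "n.c" not in subset

    sources = ["a.c", "b.c", "m.c", "n.c", "z.c"]
    print(find_bad_file(sources, fake_build_and_test))
```

Each round halves the number of suspect files, so the number of rebuilds grows with the logarithm of the file count. If several files are problematic, the search has to be rerun after each one is found and pinned to /Od.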
Should I worry about precision?
When the optimizer is turned on, there may be a minor loss or gain of precision. For example, on IA-32, a double-precision floating-point value is held as 80 bits in the x87 FPU registers, and intermediate calculations are carried out at this precision. When a value is stored to memory from the x87 FPU registers, it is rounded to its declared precision, so a result can depend on which intermediate values the optimizer keeps in registers. If your code is sensitive to slight variations in precision, its behavior may change under optimization. You can either add the -fp-model precise (Linux/Mac OS X) or /fp:precise (Windows) switch to enforce IEEE precision, or you can rewrite the code. These switches may have a performance impact on your application. For more details, please see the article Consistency of Floating Point Results using the Intel® Compiler.
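A related effect is easy to reproduce in any language that uses IEEE double precision: floating-point addition is not associative, so evaluating the same expression in a different grouping can change the last bits of the result. The sketch below uses Python only because it is easy to run. It does not involve the Intel compilers, and whether a given optimization level actually regroups your expressions is an assumption to verify for your own code. It also shows the usual rewrite for code that is sensitive to such differences: compare with a tolerance instead of testing for exact equality.

```python
import math

# The same three values summed in two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

# Exact comparison is fragile: the two results differ in the last bit.
print(left_to_right == regrouped)            # False
print(repr(left_to_right), repr(regrouped))  # 0.6000000000000001 0.6

# A tolerance-based comparison is robust to this kind of variation.
print(math.isclose(left_to_right, regrouped, rel_tol=1e-12))  # True
```

Tests that compare floating-point output against stored reference values are a common place where this kind of exact comparison needs to be replaced.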
My Program Runs Slowly - What Should I Do?
The software version of the "80/20 rule" is that 80% of a program's execution time is spent in 20% of its code. It is recommended that you use a performance analysis tool that shows exactly where your program spends its time. Intel offers the VTune™ Performance Analyzer and Intel® Parallel Amplifier tools for this purpose. The analysis shows which lines of your program take most of the execution time and provides tips for improving your code.
Conclusion: What To Do If You Still Have Questions?
Intel wants you to be successful in your use of its compilers. We encourage you to submit any questions or issues in the appropriate compiler User Forum. However, if you believe you have a technical issue or need 1:1 attention, a year of product support through Intel Premier Support is included with all purchases, and Intel Premier Support is also provided as part of the 30-day evaluation process. Please register for support here. Once you have registered, please log in here and submit your issue. A developer support engineer will get back to you within one business day.
When submitting a compile-time problem, please include a test case with your issue. For a C++ issue, submit a preprocessed file generated with the /P (Windows) or -E (Linux/Mac OS X) option. Note that a preprocessed file contains all code from your source file and from any included header files that is not eliminated by #if processing. If you don't submit a test case, we will attempt, where feasible, to construct one from the symptoms you describe, but reproducing a problem this way is often difficult. When submitting a run-time issue, it is preferred that you send us the function in your program that is failing, along with a driver that calls the function and demonstrates the failure.
For performance-related issues, wherever possible, please determine the source of the problem using a performance analyzer such as the VTune Performance Analyzer. Keep in mind that, in general, the smaller the test case you submit, the more quickly a support engineer and the engineering team will be able to address your problem.
In all cases, please give us the complete set of compiler options you used or your Visual Studio project, along with linker options and run-time arguments or environment settings if applicable.
- Evaluate the Intel Compilers for free.
- Get more information about Intel Compilers, Intel® Parallel Composer, VTune™ Performance Analyzer and Intel® Parallel Amplifier. Information available includes product overviews, case studies, getting started guides, and compatibility information.
- Join our developer community at the User Forums.
- Explore our Knowledge Base.