Intel® C and C++ Compilers

Leadership application performance

  • Rich set of components to efficiently implement higher-level, task-based parallelism
  • Future-proof applications to tap multicore and many-core power
  • Compatible with multiple compilers and portable to various operating systems

Performance without compromise

  • Industry-leading performance on Intel and compatible processors.
  • Extensive optimizations for the latest Intel processors, including the Intel® Xeon Phi™ coprocessor
  • Scale forward with support for multi-core, manycore and multiprocessor systems through OpenMP, automatic parallelism, and Intel Xeon Phi coprocessor support
  • Patented automatic CPU dispatch feature runs code optimized for specified processors, identified at application runtime.
  • Intel® Performance Guide provides suggestions for improving performance in your Windows* applications.

Broad support for current and previous C and C++ standards, plus popular extensions

  • Full C++11 and most C99 language support. For details on C++11, see http://software.intel.com/en-us/articles/c0x-features-supported-by-intel-c-compiler
  • Extensive OpenMP 4.0* support

Faster, more scalable applications with advanced parallel models and libraries

Intel provides a variety of scalable, easy-to-use parallel models. These highly abstracted models and libraries simplify adding both task and vector parallelism. The result is faster, more scalable applications running on multi-core and manycore architectures.

Intel® Cilk™ Plus (included with Intel C++ compiler)

  • Simplifies adding parallelism for performance with only three keywords
  • Scales for the future: the runtime system operates smoothly on systems with hundreds of cores
  • Vectorized and threaded for highest performance on all Intel and compatible processors
  • Sample code, contributed libraries, open specifications and other information are available from the Cilk Plus community.
  • Included with Intel C++ compiler and available in the GCC 4.9 development branch (with -fcilkplus and the caveat that _Cilk_for is not supported yet) and in a Clang*/LLVM* project at http://cilkplus.github.io/.

OpenMP 4.0 (included with Intel C++ compiler)

  • Support for most of the new features in the OpenMP* 4.0 API Specification (user-defined reductions not yet supported)
  • Support for C, C++, and Fortran OpenMP programs on Windows*, Linux*, and OS X*
  • Complete support for industry-standard OpenMP pragmas and directives in the OpenMP 3.1 API Specification
  • Intel-specific extensions to optimize performance and verify intended functionality
  • Intel compiler OpenMP libraries are object-level compatible with Microsoft Visual C++* on Windows and GCC on Linux*

Intel® Math Kernel Library

  • Vectorized and threaded for highest performance using de facto standard APIs for simple code integration
  • C, C++ and Fortran compiler-compatible with royalty-free licensing for low cost deployment

Intel® Integrated Performance Primitives

  • Performance: Pre-optimized building blocks for compute-intensive tasks
  • A consistent set of APIs that support multiple operating systems and architectures
    • Windows*, Linux*, Android*, and OS X*
    • Intel® Quark™, Intel® Atom™, Intel® Core™, Intel® Xeon®, and Intel® Xeon Phi™ processors

Intel® Threading Building Blocks

  • Rich set of components to efficiently implement higher-level, task-based parallelism
  • Compatible with multiple compilers and portable to various operating systems

Intel® Media SDK 2014 for Clients

  • A cross-platform API for developing consumer and professional media applications.
  • Intel® Quick Sync Video: Hardware-accelerated video encoding, decoding, and transcoding.
  • Development Efficiency: Code once now and see it work on tomorrow's platforms.

A drop-in addition for C and C++ development

  • Windows*
    • Develop, build, debug and run from the familiar Visual Studio IDE
    • Works with Microsoft Visual Studio* 2008, 2010, 2012 and 2013
    • Source and binary compatible with Visual C++*
  • Linux*
    • Develop, build, debug and run using Eclipse* IDE interface or command line
    • Source and binary compatible with GCC
  • OS X*
    • Develop, build, debug and run from the familiar Xcode* IDE
    • Works with Xcode 4.6, 5.0 and 5.1
    • Source and binary compatible with LLVM-GCC and Clang* tool chains
  • 32-bit and 64-bit development included

  1. Project and source in Visual Studio
  2. C/C++ aware text editor
  3. Debug C/C++ code
  4. Call Stack information
  5. Set breakpoints at specific source lines in the IDE

Outstanding support

One year of support is included with purchase, giving you access to all product updates and new versions released in the support period, plus access to Intel Premier Support. There is also a very active user forum for help from experienced users and Intel engineers.

  • Videos on Getting Started with Intel® C++ Compiler
  • Vectorization Essentials
  • Performance Essentials with OpenMP 4.0 Vectorization

Previously recorded Webinars:

  • Update Now: What’s New in Intel® Compilers and Libraries
  • Performance essentials using OpenMP* 4.0 vectorization with C/C++
  • Intel® Cilk™ Plus Array Notation - Technology and Case Study Beta
  • OpenMP 4.0 for SIMD and Affinity Features with Intel® Xeon® Processors and Intel® Xeon Phi™ Coprocessor
  • Introduction to Vectorization using Intel® Cilk™ Plus Extensions
  • Optimizing and Compilation for Intel® Xeon Phi™ Coprocessor

More Tech Articles

Common Vectorization Tips
By AmandaS (Intel), published 10/07/2013 (2)
Compiler Methodology for Intel® MIC Architecture: Common Vectorization Tips. Handling user-defined function calls inside vector loops: if you want to vectorize a loop that has a user-defined function call, (possibly refactor the code and) make the function call a vector-elemental function. Spec...
Data Alignment to Assist Vectorization
By Rakesh Krishnaiyer (Intel), published 09/07/2013 (4)
Compiler Methodology for Intel® MIC Architecture: Data Alignment to Assist Vectorization. Overview: Data alignment is a method to force the compiler to create data objects in memory on specific byte boundaries. This is done to increase efficiency of data loads and stores to and from the processo...
Large Page Considerations
By Rakesh Krishnaiyer (Intel), published 09/06/2013 (0)
Compiler Methodology for Intel® MIC Architecture: Large Page Considerations. Use THP enabled by default in the MPSS Operating System: MPSS versions later than 2.1.4982-15 support “Transparent Huge Pages (THP)” which automatically promotes 4K pages to 2MB pages for stack and heap allocated dat...
Improving Averaging Filter performance using Intel® Cilk™ Plus
By Anoop Madhusoodhanan Prabha (Intel), published 07/25/2013 (0)
Intel® Cilk™ Plus is an extension to the C and C++ languages to support data and task parallelism. It provides three new keywords to implement task parallelism, and Array Notation, the simd pragma and Elemental Functions to express data parallelism. This article demonstrates how to improve the perfo...

You can reply to any of the forum topics below by clicking on the title. Please do not include private information such as your email address or product serial number in your posts. If you need to share private information with an Intel employee, they can start a private thread for you.


Can't select individual files to use Intel C++ in VS2012, only whole solution
By Alex K. (1)
So here's the issue I'm facing: No available option to select Intel C++ on individual file: Or on grouping: But it IS available for the whole solution: If I make a new project though, I can select Intel C++ for individual files. Could this be because of a botched install/integration to VS2012?
Building Qt 5 with Intel C++ under Windows
By Carsten S. (1)
Hi everybody! Has anyone succeeded in building Qt 5 with the Intel C++ compiler on Windows lately? Over the past few months, I have tried to build several versions of Qt 5, including the latest 5.4.0, but haven't succeeded to compile any of them using Intel C++. At least the Qt 5.4.0 build currently doesn't break in Qt itself, but in the 3rd-party library iAccessible2, which largely consists of source files auto-generated from a COM IDL (at least as far as I understand). I have attached the corresponding compiler messages to this post. Since I originally come from the Un*x world, I am not an expert in these technologies, though. I am using Intel C++ Compiler XE 15.0 (IA-32) under Windows 7 with Visual Studio 2013 (Community Edition). I know that Intel C++ is not among the "officially supported" platforms of Qt -- maybe it can't be done at all, but I would be happy to hear whether I'm actually the only one trying to use Qt with Intel C++ under Windows (or maybe the one unsuccessful o...
Intel Parallel Studio XE 2015 Update1 for Windows - Installation Problem
By Hanfeng C. (2)
Hello, When I tried to install the latest Intel Parallel Studio XE 2015 Update 1 on my laptop, it got stuck while loading files (please see the attached picture). I had to kill the 'setup.exe' and 'chklic.exe' processes to terminate the installation. My laptop is equipped with Win7 64bit, Intel CPU i7-3610QM. I report here in case you have not noticed the problem. After I realized it was impossible to install the latest version, I downloaded an old one, Intel Parallel Studio XE 2013 SP1. It was installed without any problem and it worked well with my Visual Studio 2010 on my laptop. I have tried Cilk Plus code in the IDE with Intel C++ compiler. Thanks for making Cilk Plus available. Best, Hanfeng Chen
icpc 15 - Compiling Intrinsics
By Daniele S. (5)
Hi All, I installed the latest release of icpc for Linux (v15) and it seems that intrinsics header files are missing (typically in <composer>/compiler/include). If I try to compile avx code including immintrin.h, I get a whole bunch of errors similar to the following: /usr/lib/gcc/x86_64-redhat-linux/4.4.7/include/avxintrin.h(1158): error: identifier "__builtin_ia32_movmskps256" is undefined which means that icpc is including gcc's header file and fails to recognize gcc's builtin functions. Could anyone help with this? Thanks, Daniele
error MSB6006: "icl.exe" exited with code 2.
By dnesteruk (2)
Out of the blue I'm getting the following error "1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\Platforms\x64\PlatformToolsets\Intel C++ Compiler XE 15.0\Toolset.targets(357,5): error MSB6006: "icl.exe" exited with code 2." I've attached full project and appreciate any help you can give.
icpc: loop condition false inside the loop
By Dhairya M. (4)
I have been trying to debug a code for the last two weeks, and I have narrowed it down to this simple code which reproduces the issue.

    #include <cassert>

    template <class T>
    class Vector{
    public:
      Vector(){dim=0;}
      ~Vector(){};
      inline int Dim() const{return dim;}
    private:
      int dim;
    };

    int main(int argc, char **argv){
      Vector<int> v;
      #pragma omp parallel for
      for(int trg=0;trg<v.Dim();trg++){
        assert(trg<v.Dim());
        Vector<int> vbuff[2];
      }
      return 0;
    }

I compile the code with icpc version 12.1.6 as follows:

    icpc -O0 -openmp main.cpp

When I run the executable, I get:

    a.out: main.cpp:17: int main(int, char **): Assertion `trg<v.Dim()' failed.

Can someone help me figure out what is going on? Additional details: OS is Scientific Linux release 6.6 (Carbon). CPU is Intel(R) Xeon(R) CPU E5-2687W 0 @ 3.10GHz
": internal error: 010101_1
By saran.t (3)
I'm trying to compile Chombo on Mac OS X using the Intel compiler, however mcpcom fails after ~10 seconds, giving only the following error message ": internal error: 010101_1 This occurs whenever I try to compile at -O2 or above, but the problem disappears at -O1. Obviously this message is not very helpful to me at all as I have no idea what exactly is causing this. The fully-preprocessed source file can be found here. The exact compiler version is: icpc version 15.0.1 (gcc version 4.9.0 compatibility) The version of the LLVM/Clang toolchain used to setup the environment is: Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn) Target: x86_64-apple-darwin14.0.0 Thread model: posix ----- In case it helps, I also tried to compile the same file on Linux. The problem does NOT occur there. The compiler version on the Linux machine is: icpc version 15.0.1 (gcc version 4.3.0 compatibility) The gcc toolchain is: Using built-in specs. Target: x86_64-suse-linux Configured with: ....
Invalid handling of newlines in raw string literals
By Gerd K. (7)
Given the following test code compiled with Intel C++ Compiler XE 14.0

    #include "stdafx.h"
    #include <iostream>
    using namespace std;

    int _tmain(int argc, _TCHAR* argv[])
    {
        auto raw = LR"(1
    2)";
        std::wcout << raw << endl;
        return 0;
    }

the actual output on the console is

    1\n2

expected output

    1
    2

Is this a bug in the compiler?

Cilk plus implicit threshold
By Guilherme R. (1)
Hi, I'm new to Cilk, and I wanted to ask whether it has an implicit threshold for task creation in recursive computations like fib. If so, is it based on the number of tasks created, or on the depth of the computation? Thanks!
How to make this reduction in Cilk Plus?
By Ioannis E. Venetis (10)
Hello, I have code that is structured like this:

    float A[3], X[M], Y[M], Z[M], OUTX[N], OUTY[N], OUTZ[N];
    for (i = 0; i < N; i++) {
        // Use other arrays and i as an index to these arrays to initialize A[0], A[1], A[2]
        for (j = 0; j < M; j++) {
            // Calculate new values for A[0], A[1], A[2]
            // using more arrays where i and/or j are used as indexes
            X[j] += A[0];
            Y[j] += A[1];
            Z[j] += A[2];
        }
        OUTX[i] = A[0];
        OUTY[i] = A[1];
        OUTZ[i] = A[2];
    }

I have successfully parallelized the outer loop using OpenMP, making the array A private and adding the atomic directive before the updates to the elements of X, Y and Z (using critical was actually worse). But now I would like to try this code out using Cilk Plus. Although I have read all the documentation about reducers and reduction operations in Cilk Plus, I still cannot formulate in my mind how the above code could be implemented in Cilk Plus. I would like to replace the outer loop with a cilk_for and have ...
simple cilk_spawn Segmentation Fault
By Chris Szalwinski (1)
I'm having difficulty running a simple test case using cilk_spawn. I'm compiling under gcc 4.9.0 20130520. The following fib2010.cpp example executes in 0.028s without cilk and takes 0.376s with cilk as long as I set the number of workers to 1. If I change the number of workers to any number greater than one, I get a segmentation fault.

    // fib2010.1.cpp
    //
    #include <iostream>
    #include <cilk/cilk.h>
    #include <cilk/cilk_api.h>

    int fib(int n) {
        if (n < 2) return n;
        int x = cilk_spawn fib(n-1);
        int y = fib(n-2);
        cilk_sync;
        return x + y;
    }

    int main(int argc, char* argv[]) {
        std::cout << "No of workers = " << __cilkrts_get_nworkers() << std::endl;
        int n = 32;
        std::cout << "fib(" << n << ") = " << fib(n) << std::endl;
    }

The hardware is Dual Core AMD Opteron 8220.
cilk_for segmentation fault
By Chris Szalwinski (5)
Hi, I'm having difficulty comparing cilk_for with cilk_spawn. The following cilk_spawn code executes as I expect for command line arguments like 1000000 30

    // Recursive Implementation of Map
    // r_map.3.cpp
    #include <iostream>
    #include <iomanip>
    #include <cstdlib>
    #include <ctime>
    #include <cmath>
    #include <cilk/cilk.h>
    const double pi = 3.14159265;
    template<typename T>
    class AddSin {
        T* a;
        T* b;
    public:
        AddSin(T* a_, T* b_) : a(a_), b(b_) {}
        void operator()(int i) {
            a[i] = b[i] + std::sin(pi * (double) i / 180.)
                 + std::cos(pi * (double) i / 180.)
                 + std::tan(pi * (double) i / 180.);
        }
    };
    template <typename Func>
    void r_map(int low, int high, int grain, Func f) {
        if (high - low <= grain)
            for (int i = low; i < high; i++)
                f(i);
        else {
            int mid = low + (high - low) / 2;
            cilk_spawn r_map(low, mid, grain, f);
        }
    }
    int main(int argc, char** argv) {
        if (argc != 3) {
            std::cerr << "Incorrect number of a...
Floating Point ABI
By Nick T. (2)
Hello I noticed in the latest CilkPlus ABI specification (https://www.cilkplus.org/sites/default/files/open_specifications/CilkPlu...), it says that the caller to the library must set the floating point flags (top of page 8). This is what the LLVM implementation of CilkPlus and its runtime do, but the current Intel version of the run-time has the code to save the floating point status registers that is in LLVM's code generator and not the runtime from the LLVM repository. Please could you tell me whether: a) The floating point status flags should be set/saved by the caller b) The floating point status flags should be set/saved by the callee c) There's something I've overlooked The ABI says: "/** * Architecture - specific floating point state. mxcsr and fpcsr should be * set when CILK_SETJMP is called in client code. Note that the Win64 * jmpbuf for the Intel64 architecture already contains this information * so there is no need to use these fields on that OS/architecture. */" T...
How can I parallelize implicit loop ?
By Zvi Danovich (Intel) (1)
I have a loop whose body calls a function with an array member (dependent on the loop index) as an argument and returns one value. I can parallelize this loop by using the cilk_for() operator instead of a regular for(); this is simple and works well. This is explicit parallelization. Instead of an explicit loop instruction I can use an Array Notation construction (as shown below), which is an implicit loop. My routine is relatively long and complex, and has Array Notation constructions inside, so it cannot be declared as a vector (elemental) one. When I use the implicit loop, it is not parallelized and the run time increases substantially.

    float foo(float f_in)
    {
        float f_result;
        // LONG computation containing CILK+ Array Notation operations
        /////////////////////////////////////////////////////////
        return f_result;
    }
    int main()
    {
        float af_in[n], af_out[n];
        // Explicit parallelized loop
        cilk_for(int i=0; i<n; i++)
            af_out[i] = foo(af_in[i]);
        // Implicit non-parallelized l...
Patches or configure options to build the trunk on arm
By Karim C. (0)
Hello, I want to build the trunk on an embedded system supporting armv7 instructions. The build completed without errors, but cilk/cilk.h and libcilkrts weren't built. I checked the patches available on the internet; they do support non-x86 architectures, but I think just i386, not ARM. Are there other patches or config options to add while building so that I get those libraries along with the build? Regards
Array of Reducers - Possible in C?
By Detector (1)
I was wondering if it is possible to create an array of reducers in C? I have already read the documentation, but it always uses only one reducer. How do I use Cilk reducers for an array of int or double values? Can you give me a short example? Thanks in advance.