Intel® Composer XE Suites

Leadership application performance on systems using Intel® Core™ or Xeon® or compatible processors

  • Includes Intel® Fortran, Intel® C++, Intel® MKL, Intel® TBB and Intel® IPP
  • Features powerful parallelism models to simplify multicore support
  • Compatible with leading compilers and development environments

From $1,199

Or Download a Free 30-Day Evaluation Version

Service Pack 1 Released - What's New

 

Great Application Performance, Serial or Parallel Programming

Intel Composer XE 2013 delivers outstanding performance for your applications as they run on systems using Intel® Core™ or Xeon® processors, including Intel® Xeon Phi™ coprocessors, and IA-compatible processors. It combines all the serial and parallel tools from Intel® C++ Composer XE 2013 with those from Intel® Fortran Composer XE 2013. Microsoft Visual Studio* 2008-2013 is a prerequisite on Windows*, and the GNU toolchain is supported on Linux* and OS X*.

 

Components (Vary by package, outlined below)


Intel® C++ Compiler
  • Industry leading C and C++ application performance
  • Compatible with leading compilers and development environments

Intel® Fortran Compiler 

  • Industry leading Fortran application performance
  • Compatible with leading compilers and development environments
  • Extensive support for Fortran standards, OpenMP* and more
  • Intel resells the Rogue Wave* IMSL Fortran Numerical Library providing thousands of mathematical functions that have supported parallel architectures since 1990.

Intel® MKL
(C, C++, Fortran)

  • Vectorized and threaded for improved performance on Intel and compatible processors
  • De facto standard APIs for simple code integration
  • Compatible with all C, C++ and Fortran compilers
  • Royalty-free, per developer licensing for low cost deployment

Intel® IPP
(C++)

  • Performance: Pre-optimized Building Blocks Perform Faster
  • Time to Market: Intel Engineering Saves You Development Time
  • Cross Operating System: Windows, Linux, Mac, & Android
  • Cross Platform: Phone, Tablet, PC/Ultrabook, & Server

Intel® TBB
(C++)

  • Widely used C++ template library with rich set of components to efficiently implement higher-level, task-based parallelism
  • Future-proof applications to tap multicore and many-core power
  • Compatible with multiple compilers and portable to various operating systems

Intel® Cilk™ Plus
(C, C++)

  • Simplifies adding threading and vectorization to C/C++ applications that take advantage of processors/coprocessors with wide vectors and multiple cores.
  • 3 simple keywords for most common needs

You have a choice. Choose the package that suits your needs by operating system:

  • Windows*

    | Package | C++ compiler | Fortran compiler | Intel® MKL | Intel® TBB | Intel® IPP | Intel® Cilk™ Plus | IMSL* Library | Development Environment |
    |---|---|---|---|---|---|---|---|---|
    | Intel® Composer XE | X | X | X | X | X | X | | Visual Studio* 2008-2013 |
    | Intel® C++ Composer XE | X | | X | X | X | X | | Visual Studio 2008-2013 |
    | Intel® Visual Fortran Composer XE | | X | X | | | | | Visual Studio 2008-2013 and Visual Studio 2010 Shell* |
    | Intel® Visual Fortran Composer XE with IMSL* | | X | X | | | | X | Visual Studio 2008-2013 and Visual Studio 2010 Shell |

  • Linux*

    | Package | C++ compiler | Fortran compiler | Intel® MKL | Intel® TBB | Intel® IPP | Intel® Cilk™ Plus | Development Environment |
    |---|---|---|---|---|---|---|---|
    | Intel® Composer XE | X | X | X | X | X | X | Command Line |
    | Intel® C++ Composer XE | X | | X | X | X | X | Command Line |
    | Intel® Fortran Composer XE | | X | X | | | | Command Line |

  • OS X*

    | Package | C++ compiler | Fortran compiler | Intel® MKL | Intel® TBB | Intel® IPP | Intel® Cilk™ Plus | Development Environment |
    |---|---|---|---|---|---|---|---|
    | Intel® C++ Composer XE | X | | X | X | X | X | Command Line |
    | Intel® Fortran Composer XE | | X | X | | | | Command Line |

Performance tools included in Intel® Composer XE 2013 SP1

Intel® Cilk™ Plus

Intel Cilk Plus is part of Intel C++ Composer XE and is a powerful capability for increasing C and C++ application performance. It features array notation that simplifies vectorization, simplifies the declaration of elemental functions, and extends the language with three easy-to-use keywords to streamline task- and data-parallelism implementation. The benefit is that you save time producing readable, maintainable, scalable code that takes advantage of underlying hardware features such as wider vectors and more processing cores.

Intel Cilk Plus Array Notation

The two examples below show simple and more sophisticated uses of array notation. Each specifies an array section using a set of up to three numbers, either variable or literal, separated by colons inside the array subscript. The first is the lower bound where the array section starts, the second is the length of the array section (the number of elements selected), and the third is the stride used to select items from the array.

The first example has a lower bound of 0, an array section length of N, and an unspecified stride, which defaults to 1. In this example, a[0] is set to the product of b[0] and c[0], a[1] is set to the product of b[1] and c[1], and so on.

Array notation showing simple vector multiplication
a[0:N] = b[0:N] * c[0:N];

The second, more sophisticated example assigns to every 10th element of the array X, from X[0] through X[90], the sine of every other element of the array y, from y[20] through y[38].

More sophisticated example of array notation
X[0:10:10] = sin(y[20:10:2]);

In this example, the compiler knows that the call to the sin function can be done safely in parallel via vector operations. This enables the safe generation of vector code.
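The section semantics described above can be spelled out as ordinary loops. The following is a minimal sketch in standard C++ (not Cilk Plus syntax), assuming the source and destination sections do not overlap. The function names multiply_sections and assign_sin_section are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar equivalent of: a[0:N] = b[0:N] * c[0:N];
// (hypothetical helper, for illustration only)
std::vector<double> multiply_sections(const std::vector<double>& b,
                                      const std::vector<double>& c) {
    assert(b.size() == c.size());
    std::vector<double> a(b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] * c[i];
    }
    return a;
}

// Scalar equivalent of: dst[dlo:len:dstride] = sin(src[slo:len:sstride]);
// Each section is described by lower bound, length, and stride.
// (hypothetical helper, for illustration only)
void assign_sin_section(std::vector<double>& dst, std::size_t dlo, std::size_t dstride,
                        const std::vector<double>& src, std::size_t slo, std::size_t sstride,
                        std::size_t len) {
    if (len == 0) return;
    // The last selected element of each section must lie inside its array.
    assert(dlo + (len - 1) * dstride < dst.size());
    assert(slo + (len - 1) * sstride < src.size());
    for (std::size_t k = 0; k < len; ++k) {
        dst[dlo + k * dstride] = std::sin(src[slo + k * sstride]);
    }
}
```

Under these assumptions, the statement X[0:10:10] = sin(y[20:10:2]) corresponds to assign_sin_section(X, 0, 10, y, 20, 2, 10), which writes X[0], X[10], ..., X[90] from y[20], y[22], ..., y[38] and leaves every other element of X untouched.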

Intel Cilk Plus Elemental Function Support

If you have a function that consists of operations on scalar data and follows certain guidelines, you can declare it an elemental function using the __declspec(vector) notation. Here’s an example:

Example of elemental function syntax
// Elemental function: written for one scalar value
__declspec(vector) int foo(int x) {
    return x + 1;
}
// The compiler can call a vector version of foo across iterations
for (int i = 0; i < size; i++)
    array[i] = foo(array[i]);

The Cilk Plus implementation makes it easy to use the same notation consistently for the function definition and any function declaration. In the example above, the Intel compiler generates vector code that calls foo on multiple elements of the array at once. The result? Improved performance.
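For comparison, the same idea can be sketched with OpenMP 4.0 directives, which this page lists as supported. This is a swapped-in technique rather than Cilk Plus syntax: #pragma omp declare simd requests a vector variant of a function, and #pragma omp simd marks the calling loop. The helper name increment_all is hypothetical. A compiler without OpenMP enabled ignores both pragmas, and the code remains correct scalar code.

```cpp
#include <cassert>
#include <cstddef>

// OpenMP 4.0 counterpart of an elemental function: the pragma asks the
// compiler to generate a vector variant of foo in addition to the scalar one.
#pragma omp declare simd
int foo(int x) {
    return x + 1;
}

// Applies foo to every element of array (hypothetical helper).
// The simd pragma states that iterations are independent, so the
// compiler may process several elements per vector instruction.
void increment_all(int* array, int size) {
    assert(array != nullptr || size == 0);
    #pragma omp simd
    for (int i = 0; i < size; i++) {
        array[i] = foo(array[i]);
    }
}
```

Whether vector code is actually generated depends on the compiler, its options, and the target processor; the sketch only shows where the annotations go.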

Additional C/C++ code samples used in real-world applications are available, including Rendering (AOBench), Image Processing (Mandelbrot, Sepia Filter, Averaging Filter, Discrete Cosine Transforms), Finance (Monte Carlo, Black-Scholes, Binomial Lattice) and RTM Stencil.

Task and Data Parallelism: Intel Cilk Plus Keywords

Intel Cilk Plus supports task and data parallelism, making it easier to take advantage of more processing power in multicore systems. The benefit is improved application performance that scales.

Inserting keywords into existing code offers a simple, fast, readable, and maintainable way to take advantage of multi-core systems. For task parallelism, two keywords, as shown in the code sample below, tell the application where to start and end parallel functionality. Note that the sample code remains unchanged except for the insertion of the keywords.

Serial code (first listing) made parallel with Intel Cilk Plus keywords (second listing). No other changes to the original code.

Serial version:
int fib (int n)
{
    if (n < 2)
        return n;
    else {
        int x, y;
        x = fib(n-1);
        y = fib(n-2);
        return x+y;
    }
}

Parallel version:
int fib (int n)
{
    if (n < 2)
        return n;
    else {
        int x, y;
        x = _Cilk_spawn fib(n-1);  // may run in parallel with the caller
        y = fib(n-2);
        _Cilk_sync;                // wait for the spawned call before using x
        return x+y;
    }
}

Intel® Performance Libraries

Intel® Composer XE 2013 is a lot more than a Fortran and C++ compiler. It includes three powerful function libraries that offer an easy way to add parallelism-based performance to your application software. The three libraries are:

  • Intel® Math Kernel Library, also known as Intel® MKL
  • Intel® Integrated Performance Primitives, also known as Intel® IPP
  • Intel® Threading Building Blocks, also known as Intel® TBB

Intel® MKL

Intel® Math Kernel Library (Intel® MKL) 11.1 is a math library of highly optimized, extensively threaded routines for applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, vector math, and more.

Offering performance optimizations for current and next-generation Intel® processors, it includes improved integration with Microsoft Visual Studio*, Eclipse*, and Xcode*. Intel® MKL allows for full integration of the Intel® Compatibility OpenMP* runtime library for greater Windows*/Linux* cross-platform compatibility.

Benefits

  • Outstanding performance - multicore and multiprocessor ready
  • Automatic parallelization
  • Standard APIs in C and Fortran
  • Royalty free redistribution
  • World-class technical support, knowledge base, and active Intel® MKL forum

Intel® IPP

Intel® Integrated Performance Primitives 8.0 is an extensive library of software functions to help you develop multimedia, data processing, and communications applications for Windows*, Linux*, and OS X* environments.

Benefits

  • Pre-optimized algorithmic primitives deliver time-to-market performance
  • Future-proofs your code: pick up future processor optimizations simply by re-linking
  • Thousands of algorithmic performance primitives covering many software domains
  • Royalty free redistribution
  • Source code samples to help jumpstart your application development

IPP performance comparison details:

  • ippiFilter: performance an order of magnitude faster than an optimized compiler, with further improvements over multiple generations of SSE and AVX instruction sets.
  • ippsSqrt32f: Intel® Compiler vectorization makes a major difference in Sqrt performance alone, but IPP provides a further 8x performance boost over the Intel® Compiler.

Intel® Threading Building Blocks (Intel® TBB) 4.2

Parallelize Applications for Performance

Intel® Threading Building Blocks (Intel® TBB) 4.2 is a widely used, award-winning C++ template library for creating high-performance, scalable parallel applications. Intel® TBB is an efficient way to implement future-proof parallel applications that harness the power and performance of multicore and many-core hardware platforms.

Benefits

  • Enhance Productivity and Reliability - Rich set of components to efficiently implement higher-level, task-based parallelism
  • Gain Performance Advantage Today and Tomorrow - Future-proof applications to tap multicore and many-core power
  • Fits Within Your Environment - Compatible with multiple compilers and portable to various operating systems

Intel Performance Features

Common to Both Intel Fortran and Intel C++

The High Performance Parallel Optimizer (HPO) offers an improved ability to analyze, optimize, and parallelize more loop nests. This revolutionary capability combines vectorization, parallelization, and loop transformations into a single pass that is faster, more effective, and more reliable than prior discrete phases.

Automatic Vectorization analyzes loops and determines when it is safe and effective to execute several iterations of the loop in parallel. Utilization of auto-vectorization/auto-parallelization depends on the structure of your application software but is broadly applicable and can deliver outstanding improvements in application performance.
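What "safe" means here can be illustrated with two loops. The sketch below, in standard C++, contrasts a loop whose iterations are independent with one that carries a dependence from each iteration to the next. The function names are illustrative, and whether a particular compiler vectorizes the first loop depends on its options and the target processor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: y[i] depends only on x[i] and the old y[i],
// so several iterations can safely execute at once in vector lanes.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Loop-carried dependence: iteration i reads the value that iteration
// i - 1 just wrote, so the iterations cannot simply run side by side.
void prefix_sum(std::vector<float>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] += v[i - 1];
    }
}
```

The first loop is the kind of candidate an auto-vectorizer looks for; the second needs a different algorithm before it can benefit from vector hardware.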

Guided Auto Parallelization (GAP) can come into play when you have code that will ‘almost’ vectorize or parallelize with the auto-vectorization and parallelism capabilities. Using GAP is like doing a build except the output is a report about your specific application in which suggestions are made, which, if implemented, could lead to improved performance. You retain complete control of your code as you implement suggestions, which you can verify are safe to do.

Interprocedural Optimization (IPO) dramatically improves performance of small- or medium-sized functions that are used frequently, especially in programs that contain calls within loops.

Loop Profiler is part of the compiler and can be used to generate low overhead loop and function profiling to show hotspots and where to introduce threads.

Profile-Guided Optimization (PGO) improves application performance by reducing instruction-cache thrashing, reorganizing code layout, shrinking code size, and reducing branch mispredictions.

OpenMP 4.0 is supported to help simplify pragma-based development of parallelism in your C, C++ and Fortran applications.
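As a minimal sketch of pragma-based parallelism, the dot product below uses the OpenMP 4.0 combined construct parallel for simd with a reduction clause. The function name dot is illustrative. When OpenMP is not enabled in the compiler, the pragma is ignored and the loop runs serially with the same result for this input.

```cpp
#include <cassert>
#include <vector>

// Dot product of two equally sized vectors (illustrative helper).
// The pragma asks for the loop to be split across threads and vector
// lanes, with each partial sum combined into sum at the end.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    const long n = static_cast<long>(a.size());
    double sum = 0.0;
    #pragma omp parallel for simd reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The reduction clause is what keeps the parallel version correct: without it, multiple threads would update sum at the same time. For general floating-point data, the parallel summation order may differ from the serial one, so results can vary in the last bits.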

Intel C++

The Performance Guide is a Windows-based, interactive tool that helps you improve performance for your Windows-based C++ applications. It requires that Intel® VTune™ Amplifier XE 2013, Intel® C++ Studio XE 2013, or Intel® Parallel Studio XE 2013 also be installed.

The Performance Guide walks you through an easy-to-use workflow, telling you where you can apply Intel C++ performance techniques and letting you quickly see the results. The benefit is twofold: greater productivity and improved performance.

The Performance Guide workflow window (below) shows the steps in the workflow and keeps track of completed steps. The Performance Guide window (bottom right) provides instructions for the current step. The Check Performance Measurements window (top right) shows how performance has improved after applying each of the Intel C++ performance techniques.

The Performance Guide is a great way to add application performance and an excellent reason to consider the purchase of Intel C++ Studio XE 2013 or Intel® VTune™ Amplifier XE 2013.

The Performance Guide makes suggestions that you implement by using your familiar Windows* Visual Studio* environment. There are no new tools to learn.

Intel Fortran

Intel Fortran includes a number of performance-oriented features which vectorize and parallelize code. The compiler also supports the wider vectors in the latest Intel Architecture and compatible processors. There is also GAP, which suggests ways to get more benefit from parallelization and vectorization while keeping you in control of your code. More info on performance options below!

Performance features such as automatic vectorization are included in Intel Fortran Composer XE 2013
subroutine quad(len,a,b,c,x1,x2)
  real(4) a(len),b(len), c(len), x1(len), x2(len), s

  do i=1,len
    s = b(i)**2 - 4.*a(i)*c(i)
    if (s.ge.0.) then
      x1(i) = sqrt(s)
      x2(i) = (-x1(i) - b(i)) *0.5 / a(i)
      x1(i) = ( x1(i) - b(i)) *0.5 / a(i)
    else
      x2(i)=0.
      x1(i)=0.
    endif
  enddo
end

> ifort -c -vec-report2 quad.f90
quad.f90(4): (col. 3) remark: LOOP WAS VECTORIZED.

Intel Fortran Composer XE 2013 delivers advanced capabilities for application performance optimization, including development of parallelism for the full range of systems based on IA-32 and Intel® 64 architecture processors (including compatible processors) as well as the Intel® Xeon Phi™ coprocessors.

Intel Fortran Composer XE 2013 includes support for coarray Fortran on multi-CPU shared-memory nodes. Cluster support is available in Intel® Cluster Studio XE 2013. Other Fortran 2008 features include DO CONCURRENT, CONTIGUOUS, I/O enhancements, and intrinsic functions, including matrix multiply intrinsic functions that can call into Intel MKL. Fortran 2003 support also provides complete type-bound procedures, including GENERIC and OPERATOR. Support for Fortran 2003 features such as object orientation, type-bound procedures and operators, and C++ interoperability continues to make it easier to develop mixed-language applications. Intel Fortran interacts nicely with C++11 and C99 features in the Intel® C++ Compiler.

What’s New

Performance Leadership:

  • New Intel processor support – 3rd Generation Intel® Core™ Processors (Intel® microarchitecture code name Ivy Bridge) and Intel® microarchitecture code name Haswell.
  • Intel® Xeon Phi™ coprocessor optimizations
  • Support for Intel® Xeon Phi™ coprocessors on Windows*-based systems
  • Innovative Intel® Cilk™ Plus now even more performance-oriented

New Product Capabilities:

  • Latest OS support: Windows* 8 Desktop and Linux*, including Intel® Xeon Phi™ coprocessor support on Windows*; Windows support includes Visual Studio* 2013
  • Standards: C99, selected parts of C++11, almost complete Fortran 2003 support, selected features from Fortran 2008, and nearly complete support for OpenMP* 4.0

New Features:

  • Pointer Checker: Memory debugging & security tool
  • Performance Guide (C++ Windows only. Requires VTune Amplifier XE 2013 or Intel® C++ Studio XE 2013)
  • Extended Eigensolver routines in Intel® MKL handle larger problem sizes and use less memory
  • Conditional Numerical Reproducibility in Intel® MKL without need to align memory
  • Expanded support in Intel® MKL for automatic offload and load balancing of computations from Intel® Xeon® processors to Intel® Xeon Phi™ coprocessors

Fortran Performance Features

Common to Both Intel Fortran and Intel C++

The High Performance Parallel Optimizer (HPO) offers an improved ability to analyze, optimize, and parallelize more loop nests. This revolutionary capability combines vectorization, parallelization, and loop transformations into a single pass that is faster, more effective, and more reliable than prior discrete phases.

Automatic Vectorization analyzes loops and determines when it is safe and effective to execute several iterations of the loop in parallel. Utilization of auto-vectorization/auto-parallelization depends on the structure of your application software but is broadly applicable and can deliver outstanding improvements in application performance.

Guided Auto Parallelization (GAP) can come into play when you have code that will ‘almost’ vectorize or parallelize with the auto-vectorization and parallelism capabilities. Using GAP is like doing a build except the output is a report about your specific application in which suggestions are made, which, if implemented, could lead to improved performance. You retain complete control of your code as you implement suggestions, which you can verify are safe to do.

High Performance Parallel Optimizer (HPO) offers an improved ability to analyze, optimize, and parallelize more loop nests. This revolutionary capability combines vectorization, parallelization, and loop transformations into a single pass that is faster, more effective, and more reliable than prior discrete phases.

Interprocedural Optimization (IPO) dramatically improves performance of small- or medium-sized functions that are used frequently, especially programs that contain calls within loops.

Loop Profiler is part of the compiler and can be used to generate low overhead loop and function profiling to show hotspots and where to introduce threads.

Profile-Guided Optimization (PGO) improves application performance by reducing instruction-cache thrashing, reorganizing code layout, shrinking code size, and reducing branch mispredictions.

OpenMP 4.0 is supported to help simplify pragma-based development of parallelism in your C, C++ and Fortran applications.

Intel Fortran

Intel Fortran includes a number of performance-oriented features which vectorize and parallelize code. The compiler also supports the wider vectors in the latest Intel Architecture and compatible processors. There is also GAP, which suggests ways to get more benefit from parallelization and vectorization while keeping you in control of your code. More info on performance options below!

Performance features, such as automatic vectorization, are included in Intel Fortran Composer XE
subroutine quad(len,a,b,c,x1,x2)
  real(4) a(len),b(len), c(len), x1(len), x2(len), s

  do i=1,len
    s = b(i)**2 - 4.*a(i)*c(i)
    if (s.ge.0.) then
      x1(i) = sqrt(s)
      x2(i) = (-x1(i) - b(i)) *0.5 / a(i)
      x1(i) = ( x1(i) - b(i)) *0.5 / a(i)
    else
      x2(i)=0.
      x1(i)=0.
    endif
  enddo
end

> ifort -c -vec-report2 quad.f90
quad.f90(4): (col. 3) remark: LOOP WAS VECTORIZED.

Intel Fortran Composer XE 2013 delivers advanced capabilities for application performance optimization, including development of parallelism for the full range of systems based on IA-32 and Intel® 64 architecture processors (including compatible processors) as well as the Intel® Xeon Phi™ coprocessors.

Intel Fortran Composer XE 2013 includes support for coarray Fortran on multi-CPU shared-memory nodes. Cluster support is available in Intel® Cluster Studio XE 2013. Other Fortran 2008 features include DO CONCURRENT, CONTIGUOUS, I/O enhancements, and intrinsic functions, including matrix multiply intrinsics that support calls into Intel MKL. Fortran 2003 support also provides complete type-bound procedures such as GENERIC and OPERATOR. Support for Fortran 2003 features such as object orientation, type-bound procedures and operators, and C interoperability continues to make it easier to develop mixed-language applications. Intel Fortran interacts nicely with C++11 and C99 features in the Intel® C++ Compiler.

Intel® Math Kernel Library (Intel® MKL) 11.1

Intel Math Kernel Library 11.1 is a computing math library of highly optimized, extensively threaded math routines for applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, vector math, and more.

Offering performance optimizations for current and next-generation Intel® processors, it includes improved integration with Microsoft Visual Studio*, Eclipse*, and Xcode*. Intel® MKL also allows for full integration of the Intel® Compatibility OpenMP* runtime library for greater Windows*/Linux* cross-platform compatibility.

Benefits

  • Outstanding performance - multicore and multiprocessor ready
  • Automatic parallelization
  • Standard APIs in C and Fortran
  • Royalty free redistribution
  • World-class technical support, knowledge base, and active Intel® MKL forum

Click here for more details on Intel MKL.

Intel® Performance Libraries & Parallel Programming Models

Intel® Composer XE 2013 is a lot more than a Fortran and C++ compiler. It includes three powerful function libraries that offer an easy way to add parallelism-based performance to your application software. The three libraries are:

  • Intel® Math Kernel Library, also known as Intel® MKL
  • Intel® Integrated Performance Primitives, also known as Intel® IPP
  • Intel® Threading Building Blocks, also known as Intel® TBB

Intel® Integrated Performance Primitives (Intel® IPP) 8.0

Intel® Integrated Performance Primitives (Intel® IPP) 8.0 is an extensive library of software functions to help you develop multimedia, data processing, and communications applications for Windows*, Linux*, and OS X* environments.

Benefits

  • Pre-optimized algorithmic primitives deliver time-to-market performance
  • Future-proofs your code: pick up optimizations for future processors simply by re-linking
  • Thousands of algorithmic performance primitives covering many software domains
  • Royalty free redistribution
  • Source code samples to help jumpstart your application development

IPP Performance Comparison Details

  • ippiFilter: Performance an order of magnitude faster than an optimized compiler, with further improvements over multiple generations of SSE and AVX instruction sets.
  • ippsSqrt32f: Intel® Compiler vectorization makes a major difference in Sqrt performance alone, but IPP provides a further 8x performance boost over the Intel® Compiler.

Click here for more details on Intel IPP.

Intel® Cilk™ Plus

Intel Cilk Plus is part of Intel C++ Composer XE and is a powerful capability for increasing C++ application performance. It features array notation that simplifies vectorization, simplifies the declaration of elemental functions, and extends the C++ language with three easy-to-use keywords that streamline task- and data-parallel implementation. The benefit is that you save time in producing readable, maintainable, scalable code that delivers impressive performance by taking advantage of underlying hardware features such as wider vectors and more processing cores.

Intel Cilk Plus Array Notation

The two boxes below show simple and more sophisticated examples of array notation. Each specifies an array section using a set of three numbers, either variable or literal, separated by colons inside the array brackets. The first is the lower bound where the array section starts, the second is the length of the section (the number of elements selected), and the third is the stride used to select items from the array.

The first example has a lower bound of 0, an array section length of N, and an unspecified stride, which defaults to 1. In this example, a[0] is set to the product of b[0] and c[0], a[1] is set to the product of b[1] and c[1], and so on.

Array notation showing simple vector multiplication.
a[0:N] = b[0:N] * c[0:N];

The second shows a more sophisticated example in which every 10th item of the array X, from X[0] through X[90], is assigned the sine of every other item of the array y, from y[20] through y[38].

More sophisticated example of array notation
X[0:10:10] = sin(y[20:10:2]);

In this example, the compiler knows that the call to the sin function can be done safely in parallel via vector operations. This enables the safe generation of vector code.
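
To make the lower bound, length, and stride semantics concrete, here is a plain C++ sketch of the scalar loop that the second statement corresponds to. It does not use Cilk Plus itself. The helper name `assign_sin_section` is hypothetical, and the sketch assumes the caller passes arrays large enough for the indices involved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar equivalent of:  X[0:10:10] = sin(y[20:10:2]);
// A section a[lb:len:stride] selects a[lb], a[lb + stride], ...,
// a[lb + (len - 1) * stride].
void assign_sin_section(std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t len = 10;   // both sections select 10 elements
    assert(x.size() > 90);        // last element written is x[90]
    assert(y.size() > 38);        // last element read is y[38]
    for (std::size_t k = 0; k < len; ++k) {
        x[0 + 10 * k] = std::sin(y[20 + 2 * k]);
    }
}
```

Because every iteration reads and writes distinct elements, the iterations are independent, which is the property that allows vector code to be generated for the array-notation form.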

Intel Cilk Plus Elemental Function Support

If you have a function that consists of operations on scalar data and follows certain guidelines, you can declare it an elemental function using the __declspec(vector) notation. Here’s an example:

Example of elemental function syntax
__declspec(vector) int foo(int x) {
    return x + 1;
}

// Each call works on one element, so the compiler can vectorize the loop.
for (int i = 0; i < size; i++)
    array[i] = foo(array[i]);

The Cilk Plus implementation makes it easy to use the same syntactic notation consistently for the function definition and any function declaration. The example above shows how the Intel compiler can generate vector code that evaluates foo for multiple elements of an array at a time. The result? Improved performance.
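
For comparison, a similar idea can be sketched with the OpenMP 4.0 `declare simd` directive, which plays a comparable role to the `__declspec(vector)` annotation. This is a swapped-in technique, not Cilk Plus syntax, and the names `add_one` and `increment_all` are hypothetical. Without OpenMP enabled, the pragmas are ignored and the code runs as an ordinary scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// OpenMP 4.0 counterpart of an elemental function: the compiler may
// generate a vector variant that processes several x values per call.
#pragma omp declare simd
int add_one(int x) {
    return x + 1;
}

// Hypothetical driver: applies add_one to every element in place.
void increment_all(std::vector<int>& values) {
    const std::size_t n = values.size();
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = add_one(values[i]);
    }
}
```

The function body stays scalar in both approaches; the annotation only tells the compiler that it is safe to produce a version that handles several elements at once.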

Click here for additional C/C++ code samples used in real-world applications including Rendering (AOBench), Image Processing (Mandelbrot, Sepia Filter, Averaging Filter, Discrete Cosine Transforms), Finance (Monte Carlo, Black-Scholes, Binomial Lattice) and RTM Stencil.

Task and Data Parallelism: Intel Cilk Plus Keywords

Intel Cilk Plus supports task and data parallelism, making it easier to take advantage of more processing power in multicore systems. The benefit is improved application performance that scales.

Inserting keywords into existing code offers a simple, fast, readable, and maintainable way to take advantage of multi-core systems. For task parallelism, two keywords, as shown in the code sample below, tell the application where to start and end parallel functionality. Note that the sample code remains unchanged except for the insertion of the keywords.

Serial code (first listing) made parallel with Intel Cilk Plus keywords (second listing). No other changes to the original code.
int fib (int n)
{
  if (n < 2)
      return n;
  else {
      int x, y;
      x = fib(n-1);
      y = fib(n-2);
      return x+y;
  }
}

int fib (int n)
{
  if (n < 2)
      return n;
  else {
      int x, y;
      x = _Cilk_spawn fib(n-1);   // may run in parallel with the call below
      y = fib(n-2);
      _Cilk_sync;                 // wait for the spawned call before using x
      return x+y;
  }
}

Intel® Threading Building Blocks (Intel® TBB) 4.2

Parallelize Applications for Performance

Intel® Threading Building Blocks (Intel® TBB) 4.2 is a widely used, award-winning C++ template library for creating high performance, scalable parallel applications. Intel® TBB is an effective way to implement future-proof parallel applications that harness the power and performance of multicore and many-core hardware platforms.

Benefits

  • Enhance Productivity and Reliability - Rich set of components to efficiently implement higher-level, task-based parallelism
  • Gain Performance Advantage Today and Tomorrow - Future-proof applications to tap multicore and many-core power
  • Fits Within Your Environment - Compatible with multiple compilers and portable to various operating systems

Click here for more details on Intel® TBB.

Performance benchmark charts (images not reproduced):

  • Linear Algebra: DGEMM, Intel® Optimized SMP LINPACK, HPL LINPACK, LU Factorization, Cholesky Factorization
  • FFT: 2D and 3D FFTs on Intel® Xeon and Intel® Core Processors, Cluster FFT Performance, Cluster FFT Scalability
  • Sparse BLAS and Sparse Solver: DCSRGEMV and DCSRMM, PARDISO Sparse Solver
  • Data Fitting: Natural cubic spline construction and interpolation
  • Random Number Generator: MCG31m1
  • Vector Math: VML exp()
  • Application Benchmark: Monte-Carlo option pricing

Additional performance benchmark charts (images not reproduced):

  • Linear Algebra: Intel® Optimized SMP LINPACK, LU Factorization, QR Factorization, HPL LINPACK, Cholesky Factorization, Matrix Multiply
  • Application Benchmark: Monte Carlo Option Pricing
  • Batch 1D FFT
  • Black-Scholes

Performance Benchmarks

  • ippiFilter: Performance is significantly faster than an optimized compiler, with further improvements over multiple generations of SSE and Intel® AVX instruction sets.
  • ippsSqrt32f: Intel® Compiler vectorization makes a major difference in Sqrt performance alone, but Intel IPP provides a significant further performance boost over the Intel® Compiler.

    • Where can I get an evaluation copy of Intel® Composer XE 2013?
    • Right here! Windows* and Linux*

    • How do I tell Visual Studio* to use the Intel® C++ Compiler?
    • One easy way to do this is to load your solution into Visual Studio and then right-click over the solution name in the Solution Explorer. You’ll see a popup. Near the bottom you’ll see “Intel Composer XE 2013.” Mouse over that and click “Use Intel C++.”

    • Where is the sample code that ships with Intel Composer XE 2013?
    • On Windows, if you used the default installation paths, you can find the sample code in the following directory: Local Disk (C:)/Program Files (x86)/Intel/Composer XE 2013/Samples/en_US/C++. The Fortran sample code is in the corresponding directory …/en_US/Fortran.

      Note that all sample code is installed in zip files. When you’re ready to use them, we suggest you unzip sample file contents to a separate directory. This preserves the sample contents for later use. When you try to load some sample code, Visual Studio might walk you through a project conversion. Just click through this and your sample code will be loaded into Visual Studio!

    • Where is the Intel® Composer XE 2013 “Getting Started” tutorial?
    • Click here!

    • I’m trying to use Intel® Cilk™ Plus, Intel® Math Kernel Library (Intel® MKL), Intel® Integrated Performance Primitives (Intel® IPP) or Intel® Threading Building Blocks (Intel® TBB). I put the keywords, syntax and library calls into my source but I get errors. What’s up?
    • For more information, take advantage of Intel Forums and the Knowledge Base and check product documentation. But here’s something that might be helpful: make sure you included the proper header files in your projects.

    Intel® Composer XE 2013

    Getting Started?

    Click the Learn tab for guides and links that will quickly get you started.

    Get Help or Advice

    Search Support Articles
    Forums - The best place for timely answers from our technical experts and your peers. Use it even for bug reports.
    Support - For secure, web-based, engineer-to-engineer support, visit our Intel® Premier Support web site. Intel Premier Support registration is required.
    Download, Registration and Licensing Help - Specific help for download, registration, and licensing questions.

    Resources

    Release Notes (Intel® C++ Composer XE | Intel® Fortran Composer XE) - View Release Notes online!
    Intel® Composer XE documentation - Intel® C++ Composer XE | Intel® Fortran Composer XE
    Documentation for other software products
     
