Intel® Composer XE 2013

Great Application Performance, Serial or Parallel Programming

Intel Composer XE 2013 delivers outstanding performance for your applications as they run on systems using Intel® Core™ or Xeon® processors, including Intel® Xeon Phi™ coprocessors, and IA-compatible processors. It combines all the serial and parallel tools from Intel® C++ Composer XE 2013 with those from Intel® Fortran Composer XE 2013. On Windows, Visual Studio* 2008, 2010, or 2012 is a prerequisite; on Linux and OS X*, the GNU toolchain is supported.

Performance tools included in Intel® Composer XE 2013

Intel® Cilk™ Plus

Intel Cilk Plus is part of Intel C++ and is a powerful capability for increasing C++ application performance. It features array notation that simplifies vectorization, simplifies the declaration of elemental functions, and extends the C++ language with three easy-to-use keywords that streamline the implementation of task and data parallelism. You save time producing readable, maintainable, scalable code that takes advantage of underlying hardware features such as wider vectors and more processing cores.

Intel Cilk Plus Array Notation

The two boxes below show simple and more sophisticated examples of array notation. Each specifies an array section using a set of up to three values, either variable or literal, separated by colons inside the array brackets. The first is the lower bound where the array section starts, the second is the length of the section (the number of elements selected), and the third is the stride used to select items from the array.

The first example has a lower bound of 0, an array section length of N, and an unspecified stride, which defaults to 1. In this example, a[0] is set to the product of b[0] and c[0], a[1] is set to the product of b[1] and c[1], and so on.

Array notation showing simple vector multiplication.
a[0:N] = b[0:N] * c[0:N];
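
For readers without a Cilk Plus-capable compiler, the meaning of the statement above can be sketched as an ordinary loop in standard C++. The function name multiply_sections and the use of std::vector are assumptions made for illustration; they are not part of Cilk Plus.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar equivalent of the array-notation statement a[0:N] = b[0:N] * c[0:N].
// Hypothetical helper for illustration; not a Cilk Plus API.
std::vector<double> multiply_sections(const std::vector<double>& b,
                                      const std::vector<double>& c) {
    assert(b.size() == c.size());  // both sections must have the same length N
    std::vector<double> a(b.size());
    for (std::size_t i = 0; i < b.size(); ++i) {
        a[i] = b[i] * c[i];  // element-wise product; iterations are independent
    }
    return a;
}
```

Because no iteration depends on another, a loop of this shape is the kind of code the array notation lets the compiler turn into vector instructions.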

The second array notation example is more sophisticated: every 10th element of the array X, from X[0] through X[90], is assigned the sine of every other element of the array y, from y[20] through y[38].

More sophisticated example of array notation
X[0:10:10] = sin(y[20:10:2]);

In this example, the compiler knows that the call to the sin function can be done safely in parallel via vector operations. This enables the safe generation of vector code.
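
To make the lower bound, length, and stride concrete, here is a minimal sketch of the same assignment as a scalar loop in standard C++. The helper assign_sin_section and its parameter names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar equivalent of x[lb_x:len:stride_x] = sin(y[lb_y:len:stride_y]).
// Hypothetical helper for illustration; not a Cilk Plus API.
void assign_sin_section(std::vector<double>& x, std::size_t lb_x, std::size_t stride_x,
                        const std::vector<double>& y, std::size_t lb_y, std::size_t stride_y,
                        std::size_t len) {
    if (len == 0) return;
    // The last element touched on each side must lie inside its array.
    assert(lb_x + (len - 1) * stride_x < x.size());
    assert(lb_y + (len - 1) * stride_y < y.size());
    for (std::size_t k = 0; k < len; ++k) {
        x[lb_x + k * stride_x] = std::sin(y[lb_y + k * stride_y]);
    }
}
```

Calling assign_sin_section(x, 0, 10, y, 20, 2, 10) writes x[0], x[10], and so on through x[90], reading y[20], y[22], and so on through y[38]. Both sections have the same length (10), which is what lets them be paired element by element.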

Intel Cilk Plus Elemental Function Support

If you have a function that consists of operations on scalar data and follows certain guidelines, you can declare it an elemental function using the __declspec(vector) notation. Here’s an example:

Example of elemental function syntax
__declspec(vector) int foo(int x) {
  return x + 1;
}
for (int i = 0; i < size; i++)
  array[i] = foo(array[i]);

The Cilk Plus implementation makes it easy to use the same notation consistently for the function definition and any function declaration. In the example above, the annotation lets the Intel compiler generate vector code that calls foo on multiple array elements at a time. The result? Improved performance.
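
As a portable reference, the same computation can be written in standard C++ without the Intel-specific annotation. This is only a sketch of what the elemental call computes. The wrapper apply_foo is a hypothetical name, and no claim is made that a given compiler will vectorize it.

```cpp
#include <cstddef>
#include <vector>

// Scalar function with no side effects: each result depends only on its own argument.
// The __declspec(vector) annotation is omitted here because it is Intel-specific.
int foo(int x) {
    return x + 1;
}

// Applies foo to every element in place (hypothetical wrapper for illustration).
void apply_foo(std::vector<int>& array) {
    for (std::size_t i = 0; i < array.size(); ++i) {
        array[i] = foo(array[i]);
    }
}
```

The property that matters is that foo reads only its argument and writes only its return value, so calls for different elements can be evaluated in any order or together.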

Task and Data Parallelism: Intel Cilk Plus Keywords

Intel Cilk Plus supports task and data parallelism, making it easier to take advantage of more processing power in multicore systems. The benefit is improved application performance that scales.

Inserting keywords into existing code offers a simple, fast, readable, and maintainable way to take advantage of multicore systems. For task parallelism, two keywords, as shown in the code sample below, mark where parallel work may start (_Cilk_spawn) and where it must have finished (_Cilk_sync). Note that the sample code remains unchanged except for the insertion of the keywords.

Serial code (first listing) made parallel with Intel Cilk Plus keywords (second listing). Apart from the inserted keywords, the original code is unchanged.
int fib (int n)
{
  if (n <= 2)
    return n;
  else {
    int x, y;
    x = fib(n-1);
    y = fib(n-2);
    return x+y;
  }
}

int fib (int n)
{
  if (n <= 2)
    return n;
  else {
    int x, y;
    x = _Cilk_spawn fib(n-1);
    y = fib(n-2);
    _Cilk_sync;
    return x+y;
  }
}
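
The spawn/sync pattern has the same fork-join shape as the following standard C++ sketch, which uses std::async in place of the Cilk Plus keywords. This is a different mechanism: std::async with std::launch::async starts a new thread per task rather than using the Cilk Plus runtime scheduler. The names fib_serial and fib_parallel and the depth cutoff are assumptions added for illustration.

```cpp
#include <future>

// Serial version, with the same recurrence and base case as the listing above.
int fib_serial(int n) {
    if (n <= 2) return n;
    return fib_serial(n - 1) + fib_serial(n - 2);
}

// Fork-join version: one branch runs as an asynchronous task (the "spawn"),
// the other runs in the calling thread, and get() waits for the task (the "sync").
// The depth cutoff is an assumption added here to bound the number of threads.
int fib_parallel(int n, int depth = 3) {
    if (n <= 2) return n;
    if (depth == 0) return fib_serial(n);
    std::future<int> x = std::async(std::launch::async, fib_parallel, n - 1, depth - 1);
    int y = fib_parallel(n - 2, depth - 1);
    return x.get() + y;
}
```

As in the Cilk Plus listing, the result of the first branch must not be read until the join point. Here that point is x.get(); in the listing it is _Cilk_sync.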

Intel® Performance Libraries

Intel® Composer XE 2013 is a lot more than a Fortran and C++ compiler. It includes three powerful function libraries that offer an easy way to add parallelism-based performance to your application software. The three libraries are:

  • Intel® Math Kernel Library, also known as Intel® MKL
  • Intel® Integrated Performance Primitives, also known as Intel® IPP
  • Intel® Threading Building Blocks, also known as Intel® TBB

Intel® MKL

Intel® Math Kernel Library (Intel® MKL) 11.0 is a computing math library of highly optimized, extensively threaded math routines for applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, vector math, and more.

Offering performance optimizations for current and next-generation Intel® processors, it includes improved integration with Microsoft Visual Studio*, Eclipse*, and Xcode*. Intel® MKL also integrates fully with the Intel® Compatibility OpenMP* runtime library for greater Windows*/Linux* cross-platform compatibility.

Benefits

  • Outstanding performance - multicore and multiprocessor ready
  • Automatic parallelization
  • Standard APIs in C and Fortran
  • Royalty free redistribution
  • World-class technical support, knowledge base, and active Intel® MKL forum

Click here for more details on Intel MKL.

Intel® IPP

Intel® Integrated Performance Primitives 7.1 is an extensive library of software functions to help you develop multimedia, data processing, and communications applications for Windows*, Linux*, and OS X* environments.

Benefits

  • Pre-optimized algorithmic primitives deliver time-to-market performance
  • Future-proofs your code: pick up future processor optimizations simply by re-linking
  • Thousands of algorithmic performance primitives covering many software domains
  • Royalty free redistribution
  • Source code samples to help jumpstart your application development

IPP performance comparison details

  • ippiFilter: performance is an order of magnitude faster than optimized compiler-generated code, with further improvements over multiple generations of SSE and AVX instruction sets.
  • ippsSqrt32f: Intel® Compiler vectorization alone makes a major difference in Sqrt performance, but IPP provides a further 8x performance boost over the Intel® Compiler.

Click here for more details on Intel IPP.

Intel® Threading Building Blocks (Intel® TBB) 4.1

Parallelize Applications for Performance

Intel® Threading Building Blocks (Intel® TBB) 4.1 is a widely used, award-winning C++ template library for creating high-performance, scalable parallel applications. Intel® TBB is an efficient way to implement future-proof parallel applications that harness the power and performance of multicore and many-core hardware platforms.

Benefits

  • Enhance Productivity and Reliability - Rich set of components to efficiently implement higher-level, task-based parallelism
  • Gain Performance Advantage Today and Tomorrow - Future-proof applications to tap multicore and many-core power
  • Fits Within Your Environment - Compatible with multiple compilers and portable to various operating systems

Click here for more details on Intel® TBB.

Intel Performance Features

Common to Both Intel Fortran and Intel C++

The High Performance Parallel Optimizer (HPO) offers an improved ability to analyze, optimize, and parallelize more loop nests. This revolutionary capability combines vectorization, parallelization, and loop transformations into a single pass that is faster, more effective, and more reliable than prior discrete phases.

Automatic Vectorization analyzes loops and determines when it is safe and effective to execute several iterations of the loop in parallel. Utilization of auto-vectorization/auto-parallelization depends on the structure of your application software but is broadly applicable and can deliver outstanding improvements in application performance.
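
What "safe" means here can be illustrated with two small loops in standard C++. This is a sketch only: the function names are hypothetical, and whether a particular compiler actually vectorizes the first loop depends on its options and analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Iterations are independent: y[i] depends only on x[i] and the old y[i],
// so a vectorizer may process several iterations at once.
void scale_and_add(float alpha, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}

// Loop-carried dependence: iteration i reads the value written by iteration i - 1,
// so the iterations cannot simply be executed side by side.
void running_sum(std::vector<float>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] = v[i] + v[i - 1];
    }
}
```

The first loop is the typical candidate for automatic vectorization. The second needs a different algorithm or restructuring before several iterations can run together.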

Guided Auto Parallelization (GAP) can come into play when you have code that will ‘almost’ vectorize or parallelize with the auto-vectorization and parallelism capabilities. Using GAP is like doing a build except the output is a report about your specific application in which suggestions are made, which, if implemented, could lead to improved performance. You retain complete control of your code as you implement suggestions, which you can verify are safe to do.

Interprocedural Optimization (IPO) dramatically improves the performance of small or medium-sized functions that are used frequently, especially in programs that contain calls within loops.

Loop Profiler is part of the compiler and can be used to generate low overhead loop and function profiling to show hotspots and where to introduce threads.

Profile-Guided Optimization (PGO) improves application performance by reducing instruction-cache thrashing, reorganizing code layout, shrinking code size, and reducing branch mispredictions.

OpenMP 3.1 is supported to help simplify pragma-based development of parallelism in your C, C++ and Fortran applications.
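
A minimal sketch of pragma-based parallelism in C++ follows. The function sum_of_squares is a hypothetical example, and the option that enables OpenMP depends on the compiler; when OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Sums the squares of the input. With OpenMP enabled, iterations are divided
// among threads and the per-thread partial sums are combined by the reduction clause.
// Without OpenMP, the pragma is ignored and the loop runs serially.
double sum_of_squares(const std::vector<double>& v) {
    double sum = 0.0;
    const long n = static_cast<long>(v.size());
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += v[i] * v[i];
    }
    return sum;
}
```

The reduction clause is what keeps the shared accumulator correct: each thread works on a private copy of sum, and the copies are added together when the loop ends.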

Intel C++

The Performance Guide is a Windows-based, interactive tool that helps you improve performance for your Windows-based C++ applications. It requires that Intel® VTune™ Amplifier XE 2013, Intel® C++ Studio XE 2013, or Intel® Parallel Studio XE 2013 also be installed.

The Performance Guide walks you through an easy-to-use workflow, telling you where you can apply Intel C++ performance techniques and letting you quickly see the results. The benefit is twofold: greater productivity and improved performance.

The Performance Guide workflow window (below) shows the steps in the workflow and keeps track of completed steps. The Performance Guide window (bottom right) provides instructions for the current step. The Check Performance Measurements window (top right) shows how performance has improved after applying each of the Intel C++ performance techniques.

The Performance Guide is a great way to add application performance and an excellent reason to consider the purchase of Intel C++ Studio XE 2013 or Intel® VTune™ Amplifier XE 2013.

The Performance Guide makes suggestions that you implement in your familiar Microsoft Visual Studio* environment. There are no new tools to learn.

Intel Fortran

Intel Fortran includes a number of performance-oriented features which vectorize and parallelize code. The compiler also supports the wider vectors in the latest Intel Architecture and compatible processors. There is also GAP, which suggests ways to get more benefit from parallelization and vectorization while keeping you in control of your code. More info on performance options below!

Performance features, such as Automatic Vectorization, are included in Intel Fortran Composer XE 2013
subroutine quad(len,a,b,c,x1,x2)
  real(4) a(len),b(len), c(len), x1(len), x2(len), s

  do i=1,len
    s = b(i)**2 - 4.*a(i)*c(i)
    if (s.ge.0.) then
      x1(i) = sqrt(s)
      x2(i) = (-x1(i) - b(i)) *0.5 / a(i)
      x1(i) = ( x1(i) - b(i)) *0.5 / a(i)
    else
      x2(i)=0.
      x1(i)=0.
    endif
  enddo
end

> ifort -c -vec-report2 quad.f90
quad.f90(4): (col. 3) remark: LOOP WAS VECTORIZED.

Intel Fortran Composer XE 2013 delivers advanced capabilities for application performance optimization, including development of parallelism for the full range of systems based on IA-32 and Intel® 64 architecture processors (including compatible processors) as well as the Intel® Xeon Phi™ coprocessors.

Intel Fortran Composer XE 2013 includes support for coarray Fortran on multi-CPU shared-memory nodes. Cluster support is available in Intel® Cluster Studio XE 2013. Other Fortran 2008 features include DO CONCURRENT, CONTIGUOUS, I/O enhancements, and intrinsic functions, including matrix multiply intrinsic functions that support calls into Intel MKL. Fortran 2003 support also provides complete type-bound procedures, including GENERIC and OPERATOR. Support for Fortran 2003 features such as object orientation, type-bound procedures and operators, and C++ interoperability continues to make it easier to develop mixed-language applications. Intel Fortran interacts nicely with C++11 and C99 features in the Intel® C++ Compiler.

What’s New

Performance Leadership:

  • New Intel processor support – 3rd Generation Intel® Core™ Processors (Intel® microarchitecture code name Ivy Bridge) and Intel® microarchitecture code name Haswell.
  • Intel® Xeon Phi™ coprocessor optimizations
  • Innovative Intel® Cilk™ Plus now even more performance-oriented

New Product Capabilities

  • Latest OS: Windows* 8 Desktop, Linux*
  • Standards: C99, selected parts of C++11, almost complete Fortran 2003 support, selected features from Fortran 2008, and OpenMP* 3.1

New Features:

  • Pointer Checker: Memory debugging & security tool
  • Performance Guide (C++, Windows only; requires Intel® VTune™ Amplifier XE 2013 or Intel® C++ Studio XE 2013)

Intel® Cilk™ Plus

Intel Cilk Plus is part of Intel C++ and is a powerful capability for increasing C++ application performance. It features array notation that simplifies vectorization, supports simplification of elemental functions declarations and extends the C++ language with 3 easy-to-use keywords to streamline task- and data-parallelism implementation. The benefit is that you save time in producing readable, maintainable, scalable code that delivers impressive performance benefits by taking advantage of underlying hardware features such as wider vectors and more processing cores.

Intel Cilk Plus Array Notation

The two boxes below show simple and more sophisticated examples of array notation. Each specifies an array section using a set of 3 numbers, either variable or literal, separated by colons inside the array brackets. The first is the lower bound where the array section starts, the second is the length of the array section (the number of elements selected), and the third is the stride used to select items from the array.

The first example has a lower bound of 0, an array section length of N and an unspecified stride, which defaults to 1. In this example, a[0] is set to the product of b[0] and c[0], a[1] is set to the product of b[1] and c[1], and so on.

Array notation showing simple vector multiplication.
a[0:N] = b[0:N] * c[0:N];

The second shows a more sophisticated example in which every 10th element of the array x, from x[0] through x[90], is assigned the sine of every other element of the array y, from y[20] through y[38].

More sophisticated example of array notation
x[0:10:10] = sin(y[20:10:2]);

In this example, the compiler knows that the call to the sin function can be done safely in parallel via vector operations. This enables the safe generation of vector code.
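
As a rough guide to what the two array notation statements above mean, the sketch below writes them out as ordinary scalar loops in standard C++. The helper names multiply_sections and strided_sine are hypothetical and exist only for this illustration; the Intel compiler is free to generate vector code rather than execute the loops one element at a time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar equivalent of: a[0:n] = b[0:n] * c[0:n];
// (lower bound 0, length n, default stride 1)
void multiply_sections(std::vector<double>& a,
                       const std::vector<double>& b,
                       const std::vector<double>& c,
                       std::size_t n) {
    assert(a.size() >= n && b.size() >= n && c.size() >= n);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = b[i] * c[i];
    }
}

// Scalar equivalent of: x[0:10:10] = sin(y[20:10:2]);
// Left side:  lower bound 0,  length 10, stride 10 -> x[0], x[10], ..., x[90]
// Right side: lower bound 20, length 10, stride 2  -> y[20], y[22], ..., y[38]
void strided_sine(std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t length = 10;
    assert(x.size() > 90 && y.size() > 38);
    for (std::size_t k = 0; k < length; ++k) {
        x[0 + k * 10] = std::sin(y[20 + k * 2]);
    }
}
```

Both sides of an array notation assignment select the same number of elements (10 in the second statement), which is why a single loop counter can index both sections.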

Intel Cilk Plus Elemental Function Support

If you have a function that consists of operations on scalar data and follows certain guidelines, you can declare it an elemental function using the __declspec(vector) notation. Here’s an example:

Example of elemental function syntax
__declspec(vector) int foo(int x) {
    return x + 1;
}

for (int i = 0; i < size; i++)
    array[i] = foo(array[i]);

The Cilk Plus implementation makes it easy to use the same syntactic notation consistently for the function definition and any function declarations. The example above shows how to have the Intel compiler generate vector code that applies foo to multiple array elements at a time. The result? Improved performance.
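
To make the idea concrete, here is a minimal sketch of the same pattern in standard C++: a scalar function with no side effects, applied independently to each array element. The names add_one and apply_to_all are hypothetical. The __declspec(vector) annotation is left out so that the sketch compiles with any C++ compiler; with the Intel compiler, the annotation is what requests a vector version of the function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A candidate elemental function: it reads only its scalar argument and
// has no side effects, so calls on different elements are independent.
// With the Intel compiler, this definition would carry the
// __declspec(vector) annotation shown in the example above.
int add_one(int x) {
    return x + 1;
}

// Applies add_one to the first `size` elements. Because the iterations do
// not depend on each other, a vectorizing compiler may process several
// elements per instruction.
void apply_to_all(std::vector<int>& array, std::size_t size) {
    assert(size <= array.size());
    for (std::size_t i = 0; i < size; ++i) {
        array[i] = add_one(array[i]);
    }
}
```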

Task and Data Parallelism: Intel Cilk Plus Keywords

Intel Cilk Plus supports task and data parallelism, making it easier to take advantage of more processing power in multicore systems. The benefit is improved application performance that scales.

Inserting keywords into existing code offers a simple, fast, readable, and maintainable way to take advantage of multi-core systems. For task parallelism, two keywords, as shown in the code sample below, tell the application where to start and end parallel functionality. Note that the sample code remains unchanged except for the insertion of the keywords.

Serial code (first listing) made parallel with Intel Cilk Plus keywords (second listing). The original code is otherwise unchanged.

Serial version:
int fib (int n)
{
    if (n < 2)
        return n;
    else {
        int x, y;
        x = fib(n-1);
        y = fib(n-2);
        return x+y;
    }
}

Parallel version with Intel Cilk Plus keywords:
int fib (int n)
{
    if (n < 2)
        return n;
    else {
        int x, y;
        x = _Cilk_spawn fib(n-1);
        y = fib(n-2);
        _Cilk_sync;
        return x+y;
    }
}
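
For readers without a Cilk Plus compiler, the spawn/sync structure can be approximated in standard C++11 with std::async. This is only a sketch of the fork-join shape, not how Intel Cilk Plus is implemented: std::async starts threads rather than using a work-stealing scheduler, so the hypothetical depth parameter is added here to keep the number of threads small. The function names and the conventional base cases fib(0) = 0 and fib(1) = 1 are likewise choices made for this illustration.

```cpp
#include <cassert>
#include <future>

// Serial version, using the conventional base cases fib(0) = 0, fib(1) = 1.
long fib_serial(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    return fib_serial(n - 1) + fib_serial(n - 2);
}

// Fork-join version. `depth` (a hypothetical tuning knob) limits how many
// levels of the recursion start a new asynchronous task; below that level
// the serial version runs, so the number of threads stays small.
long fib_parallel(int n, int depth = 3) {
    assert(n >= 0);
    if (n < 2) return n;
    if (depth <= 0) return fib_serial(n);

    // Roughly analogous to: x = _Cilk_spawn fib(n-1);
    std::future<long> x = std::async(std::launch::async,
                                     fib_parallel, n - 1, depth - 1);
    // The current thread continues with the other branch.
    long y = fib_parallel(n - 2, depth - 1);
    // Roughly analogous to _Cilk_sync: wait for the spawned branch.
    return x.get() + y;
}
```

Calling fib_parallel(30) returns the same value as fib_serial(30); only the top few levels of the recursion run concurrently.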

Intel® Threading Building Blocks (Intel® TBB) 4.1

Parallelize Applications for Performance

Intel® Threading Building Blocks (Intel® TBB) 4.1 is a widely used, award-winning C++ template library for creating high performance, scalable parallel applications. Intel® TBB is an efficient way to implement future-proof parallel applications that harness the power and performance of multicore and many-core hardware platforms.

Benefits

  • Enhance Productivity and Reliability - Rich set of components to efficiently implement higher-level, task-based parallelism
  • Gain Performance Advantage Today and Tomorrow - Future-proof applications to tap multicore and many-core power
  • Fits Within Your Environment - Compatible with multiple compilers and portable to various operating systems

Click here for more details on Intel® TBB.

Features

  • Intel® Composer XE 2013 features the acclaimed Intel Fortran and C++ Compilers. These performance-oriented compilers can deliver remarkable performance improvements with just a simple recompile. By using more features, such as the vectorization and parallelism features of Intel® Cilk™ Plus, you might get even more!
  • Intel Fortran Composer XE 2013 includes support for co-array Fortran on a single multi-CPU shared-memory node. Cluster support is available in Intel® Cluster Studio XE 2013. Other Fortran 2008 features include DO CONCURRENT, CONTIGUOUS, I/O enhancements, and intrinsic functions, a set of which includes matrix multiply intrinsic functions that support calls into Intel® Math Kernel Library (Intel® MKL).
  • Intel® Cilk™ Plus is the easiest way to add parallelism to your C++ application components. It simplifies array notation and elemental function declaration, and extends the Intel C++ compiler with 3 keywords that simplify implementing loop- and task-parallel functions. The benefit is that you save time in producing readable, maintainable, scalable code for IA-32 architecture, Intel® 64 architecture, compatible AMD processors, and, on Linux, the Intel® MIC architecture. That code delivers impressive performance benefits by taking advantage of underlying hardware features such as wider vectors and more processing cores.
  • Intel® Threading Building Blocks (Intel® TBB) 4.1 is a widely used, award-winning C++ template library for creating high performance, scalable parallel applications. It includes scalable memory allocation, load-balancing, work-stealing task scheduling, a thread-safe pipeline and concurrent containers, high-level parallel algorithms, and numerous synchronization primitives.
  • Intel® Integrated Performance Primitives (Intel® IPP) 7.1 is an extensive library of software functions to help you develop multimedia, data processing, and communications applications for Windows*, Linux*, and OS X* environments.
  • Intel® Math Kernel Library (Intel® MKL) 11.0 is a computing math library of highly optimized, extensively threaded math routines for applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, vector math, and more.
  • Intel Compilers support OpenMP*. Developers using C or C++ can create and manage portable, shared-memory parallelism in their applications by coding in OpenMP directives. If source code is compiled later using compilers that do not support OpenMP, the directives are ignored and the code is compiled for conventional, serial execution.
  • Every purchase of an Intel® Software Development Product includes one year of support services, which provides access to Intel® Premier Support and all product updates during that time. Intel Premier Support gives you online access to technical notes, application notes, and documentation. You can also take advantage of Intel Support Forums (Intel® Visual Fortran for Windows, Fortran for Linux and OS X*, and C++). Join the community: contribute, learn, or just browse!
  • Research and Development. Intel has been expanding the practical application of parallelism development tools and technology for years, and we will continue to do so. For a view into some promising technology, take a look at our What If Experimental Software site.
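
The OpenMP behavior described in the list above can be illustrated with a small C++ sketch. The function name sum_of_squares is hypothetical. The directive is a standard OpenMP work-sharing loop with a reduction; whether it takes effect depends on the compiler and on whether OpenMP support is switched on at build time.

```cpp
#include <cassert>

// Sums the squares 0^2 + 1^2 + ... + (n-1)^2.
// With OpenMP enabled, iterations are divided among threads and the
// per-thread partial sums are combined by the reduction clause.
// Without OpenMP, the pragma is ignored and the loop runs serially.
long long sum_of_squares(int n) {
    assert(n >= 0);
    long long sum = 0;
#pragma omp parallel for reduction(+ : sum)
    for (int i = 0; i < n; ++i) {
        sum += static_cast<long long>(i) * i;
    }
    return sum;
}
```

Because the sum uses integer arithmetic, the result is the same whether the loop runs serially or in parallel; sum_of_squares(10) is 285 in both cases.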

Benefits

  • Intel® Composer XE 2013 delivers outstanding application performance giving your applications a competitive edge
  • On Windows, it integrates into Visual Studio 2008, 2010 or 2012 (Visual Studio is a prerequisite) and, on Linux, supports the gnu tools to preserve the investment you have in your development environment.
  • C++ source-code compatibility enables you to mix/match compilers as your needs evolve. Cut your time to outstanding performance by using the Intel compiler on the most performance-sensitive parts of your application.
  • The multiple parallel programming models in components, such as Intel® TBB and Intel® Cilk Plus, offer you choice and enhance productivity as you add parallelism to take advantage of today's multicore systems and the Intel® Many Integrated Core Architecture (Intel® MIC Architecture).
  • Intel® Integrated Performance Primitives (Intel® IPP) 7.1 and Intel® Math Kernel Library (Intel® MKL) 11.0 offer libraries of multicore-ready, highly optimized functions that help speed development of digital media and data-processing applications. They are the easiest route to taking advantage of multicore computer configurations and systems using the Intel MIC Architecture.
  • Compiler features such as auto-parallelization, auto-vectorization and guided auto-parallelization (GAP) can save significant time in delivering outstanding application performance.

Try Intel® Composer XE 2013 today!

Summary

Take Comfort – Intel Composer XE 2013 is compatible with your code and the way you work

Intel Composer XE 2013 integrates into Microsoft Visual Studio* 2008, 2010 and 2012 and supports the gnu tool chain on Linux*. And the C++ compiler produces code that is binary compatible with Visual C++ and gcc. Intel® Fortran on Windows continues to feature full source code compatibility with Compaq Visual Fortran*. This means the investment you have in how you develop, and your code itself, is productively preserved. Your license also supports all IA-32 and Intel® 64 architecture, including Intel MIC Architecture, and includes one year of support. In addition, there’s a great community of developers out there sharing their experiences on our Forums.

Take Advantage – Intel Composer XE 2013 delivers easy-to-use performance features

Intel continues to enhance proven Intel compiler features such as the High-Performance Parallel Optimizer (HPO). This revolutionary capability combines vectorization, parallelization, and loop transformations into a single pass that is faster, more effective, and more reliable than prior discrete phases. The compilers also automatically vectorize code for use on systems with conventional Intel® Xeon®, Intel® Core and compatible AMD processors and include vectorization tools for applications targeting Intel® MIC Architecture. When the compiler can’t vectorize for use on Intel Xeon and Intel Core processors, you can use the Guided Auto-Parallelization (GAP) feature to get a report suggesting changes so your code will vectorize. Interprocedural optimization and profile-guided optimization continue to provide developers with opportunities to enhance performance by in-lining code and restructuring code based on workload, respectively. Performance is #1 at Intel.

Take it Easy – Intel Performance Libraries keep you productive and deliver application performance

Intel Composer XE is a lot more than C++ and Fortran compilers. It includes Intel® Threading Building Blocks 4.1, a widely used, award-winning C++ template library that simplifies creating reliable, portable, low-maintenance and scalable parallel applications. Intel® Math Kernel Library 11.0 is also included. It’s a library of highly optimized, extensively threaded math routines, including BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, vector math and much more. Intel® Integrated Performance Primitives 7.1 is included as well. It offers highly optimized, extensively threaded functions for multimedia, compression, data processing, communications and more. Last but not least, Intel Composer XE 2013 includes lots of sample code and tutorials to simplify development with examples and snippets.

Take a Test Drive – See for yourself how Intel Composer XE 2013 can help you

Intel Composer XE 2013 30-day evaluations are available for download from our web site. Click or enter this link: http://intel.ly/sw-tools-eval. Additional detail is presented on the next page but, in general, you’ll need a system with Visual Studio 2008, 2010 or 2012 for a Windows eval. On Linux, you’ll need a system capable of running the gnu tool chain. That’s basically it! The download includes tutorials and lots of code samples, or you can jump right in using your own code. To join the community of your fellow Intel Composer XE developers, visit the Intel Developer Zone Forums or click support.

Review the resources below to learn how to use Intel Parallel tools. Be sure to go to the Intel Learning Lab Portal for a complete offering of videos, whitepapers, and other resources to learn how to take advantage of this product. Visit the Evaluation Guide Portal for concise, step by step guides to see the power of Intel Development Products.

  • Compilation for Intel® Xeon Phi™ Coprocessor
  • GNU Debugger Intel® Xeon Phi™ Coprocessor
  • Get Ready for Intel® Math Kernel Library on Intel® Xeon Phi™ Coprocessor
  • Expressing Parallelism and Vector Operations for Optimal Performance using Intel® Composer XE
  • Live demonstration showing programming techniques for Intel® Xeon Phi™ Coprocessor

  • Setting up an Intel® TBB Project in Microsoft Visual Studio
  • Julian Horn, Architect talks about Static Analysis
  • Where can I get an evaluation copy of Intel® Composer XE 2013?
  • Right here! Windows* and Linux*

  • How do I tell Visual Studio* to use the Intel® C++ Compiler?
  • One easy way to do this is to load your solution into Visual Studio and then right-click over the solution name in the Solution Explorer. You’ll see a popup. Near the bottom you’ll see “Intel Composer XE 2013.” Mouse over that and click “Use Intel C++.”

  • Where is the sample code that ships with Intel Composer XE 2013?
  • On Windows, if you used the default installation paths, you can find the sample code in the following directory: Local Disk (C:)/Program Files (x86)/Intel/Composer XE 2013/Samples/en_US/C++. The Fortran sample code is in the same directory …/ en_US/Fortran.

    Note that all sample code is installed in zip files. When you’re ready to use the samples, we suggest you unzip the sample file contents to a separate directory. This preserves the original sample contents for later use. When you try to load some sample code, Visual Studio might walk you through a project conversion. Just click through this and your sample code will be loaded into Visual Studio!

  • Where is the Intel® Composer XE 2013 “Getting Started” tutorial?
  • Click here!

  • I’m trying to use Intel® Cilk™ Plus, Intel® Math Kernel Library (Intel® MKL) 11.0, Intel® Integrated Performance Primitives (Intel® IPP) 7.1 or Intel® Threading Building Blocks (Intel® TBB) 4.1. I put the keywords, syntax and library calls into my source but I get errors. What’s up?
  • For more information, take advantage of Intel Forums and the Knowledge Base and check product documentation. But here’s something that might be helpful: make sure you included the proper header files in your projects.
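
As a starting point for the header-file check mentioned above, the sketch below reports whether the usual top-level header of each component is visible to the compiler. The header names (cilk/cilk.h, mkl.h, ipp.h, tbb/tbb.h) are the commonly used ones; the macro names, the HeaderStatus struct and the check_headers function are hypothetical, and individual routines may need more specific headers described in the product documentation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// __has_include is standard since C++17; on compilers without it,
// every header is simply reported as not found.
#if defined(__has_include)
#  if __has_include(<cilk/cilk.h>)
#    define FOUND_CILK_HEADER 1
#  endif
#  if __has_include(<mkl.h>)
#    define FOUND_MKL_HEADER 1
#  endif
#  if __has_include(<ipp.h>)
#    define FOUND_IPP_HEADER 1
#  endif
#  if __has_include(<tbb/tbb.h>)
#    define FOUND_TBB_HEADER 1
#  endif
#endif

struct HeaderStatus {
    std::string header;  // name to use in #include <...>
    bool found;          // true if the compiler can locate it
};

// Returns one entry per component, in the order Cilk Plus, MKL, IPP, TBB.
std::vector<HeaderStatus> check_headers() {
    std::vector<HeaderStatus> result;
#ifdef FOUND_CILK_HEADER
    result.push_back({"cilk/cilk.h", true});
#else
    result.push_back({"cilk/cilk.h", false});
#endif
#ifdef FOUND_MKL_HEADER
    result.push_back({"mkl.h", true});
#else
    result.push_back({"mkl.h", false});
#endif
#ifdef FOUND_IPP_HEADER
    result.push_back({"ipp.h", true});
#else
    result.push_back({"ipp.h", false});
#endif
#ifdef FOUND_TBB_HEADER
    result.push_back({"tbb/tbb.h", true});
#else
    result.push_back({"tbb/tbb.h", false});
#endif
    assert(result.size() == 4);
    return result;
}
```

If a header is reported as not found, the include directories in the project settings are a likely place to look first.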

Intel® Composer XE 2013

Getting Started?

Click the Learn tab for guides and links that will quickly get you started.

Get Help or Advice

Search Support Articles
Forums - The best place for timely answers from our technical experts and your peers. Use it even for bug reports.
Support - For secure, web-based, engineer-to-engineer support, visit our Intel® Premier Support web site. Intel Premier Support registration is required.
Download, Registration and Licensing Help - Specific help for download, registration, and licensing questions.

Resources

Release Notes (Intel® C++ Composer XE | Intel® Fortran Composer XE) - View Release Notes online!
Intel® Composer XE documentation - Intel® C++ Composer XE | Intel® Fortran Composer XE
Documentation for other software products

Intel® Math Kernel Library 11.0

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Intel® Math Kernel Library 11.0 documentation - View documentation online!
Documentation for other software products

Intel® Integrated Performance Primitives 7.1

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Product Documentation - View documentation online!

Intel® Threading Building Blocks 4.1

Resources

Release Notes - View Release Notes online!
Product Documentation - View documentation online!

Intel® C++ Compiler for Linux*

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!

Intel® C++ Compiler for Windows*

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!

Intel® C++ Compiler for OS X*

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!

Intel® Visual Fortran Compiler

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!

Intel® Fortran Compiler for Linux*

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!

Intel® Fortran Compiler for OS X*

Resources

Release Notes - View Release Notes online!
Fixes List - View Compiler Fixes List
Checksums - View Product Checksums
Product Documentation - View documentation online!