Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 12/16/2022
Public

Language Support for Auto-parallelization

This topic describes specific C++ language features that help the compiler parallelize more code.

Annotating Functions with Declarations

Annotating a function with the following declaration guides the compiler to parallelize more loops and straight-line code:

//  (Windows* OS)
__declspec(concurrency_safe(cost(cycles) | profitable))
-OR-
//  (Linux* OS)
__attribute__((concurrency_safe(cost(cycles) | profitable)))

The concurrency_safe attribute tells the compiler that there are no unacceptable side effects and no illegal (or improperly synchronized) memory access interferences among multiple invocations of the annotated function, or between an invocation of the annotated function and other statements in the program, when they are executed concurrently.
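As a minimal sketch of how the two spellings might be wrapped for portable source, the following uses a hypothetical helper macro, CONCURRENCY_SAFE_PROFITABLE, that is not part of the compiler. It assumes that __INTEL_COMPILER identifies the Intel C++ Compiler Classic, that _WIN32 identifies Windows builds, and that the Linux form uses the usual GNU-style double parentheses. The function names scale_row and scale_all are illustrative only.

```cpp
// Hypothetical helper macro (not provided by the compiler): picks the
// annotation spelling for the platform and expands to nothing elsewhere.
#if defined(__INTEL_COMPILER) && defined(_WIN32)
#  define CONCURRENCY_SAFE_PROFITABLE __declspec(concurrency_safe(profitable))
#elif defined(__INTEL_COMPILER)
#  define CONCURRENCY_SAFE_PROFITABLE __attribute__((concurrency_safe(profitable)))
#else
#  define CONCURRENCY_SAFE_PROFITABLE
#endif

// scale_row writes only to the row it is given, so calls on different
// rows do not interfere with each other.
CONCURRENCY_SAFE_PROFITABLE
void scale_row(float* row, int n, float factor) {
    for (int j = 0; j < n; ++j) {
        row[j] *= factor;
    }
}

// Candidate loop for auto-parallelization: each iteration calls the
// annotated function on a distinct row.
void scale_all(float* rows[], int m, int n, float factor) {
    for (int i = 0; i < m; ++i) {
        scale_row(rows[i], n, factor);
    }
}
```

The annotation is a promise made by the programmer. It holds here only as long as the rows passed to scale_all do not overlap.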

NOTE:

For every function that is annotated with the concurrency_safe attribute, it is your responsibility to ensure that its side effects (if any) are acceptable (or expected), and that any memory access interferences are properly synchronized.
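The responsibility described in this note can be illustrated with a function that has a deliberate, synchronized side effect. This is a sketch under assumptions: the guard macro, the counter g_calls, and the functions checked_sum and sum_rows are hypothetical names, and std::atomic is one possible way to synchronize the shared update, not a requirement of the attribute.

```cpp
#include <atomic>

// Hypothetical guard macro: applies the annotation only under the Intel
// C++ Compiler Classic (Windows spelling shown); other compilers see nothing.
#if defined(__INTEL_COMPILER) && defined(_WIN32)
#  define CONCURRENCY_SAFE_PROFITABLE __declspec(concurrency_safe(profitable))
#else
#  define CONCURRENCY_SAFE_PROFITABLE
#endif

// Shared state touched by every call. std::atomic keeps the update
// well-defined if calls run concurrently.
static std::atomic<long> g_calls{0};

// Side effect: increments g_calls. The order of increments is unspecified
// under concurrent execution; the caller accepts that only the total matters.
CONCURRENCY_SAFE_PROFITABLE
float checked_sum(const float* row, int n) {
    g_calls.fetch_add(1, std::memory_order_relaxed);
    float s = 0.0f;
    for (int j = 0; j < n; ++j) {
        s += row[j];
    }
    return s;
}

// Each iteration reads one row and writes one distinct element of out.
void sum_rows(const float* const rows[], float out[], int m, int n) {
    for (int i = 0; i < m; ++i) {
        out[i] = checked_sum(rows[i], n);
    }
}
```

If g_calls were a plain long, concurrent calls would race on it, and annotating checked_sum would give the compiler a promise that the code does not keep.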

The cost clause specifies the estimated number of execution cycles of the annotated function, which the compiler uses for parallelization profitability analysis when it compiles the enclosing loops or blocks. The profitable clause indicates that the loops or blocks that contain calls to the annotated function are profitable to parallelize.

NOTE:

The value of cycles is a 2-byte unsigned integer (unsigned short), so its maximum value is 2^16-1 (65535). If the cycle count is greater than 2^16-1, use the profitable clause instead.
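One way to apply this limit is to let the preprocessor choose the clause from an estimated cycle count. The sketch below is illustrative: MAX_COST_CYCLES, EST_CYCLES, HEAVY_ANNOTATION, fits_cost_clause, and sum_of_squares are hypothetical names, the estimate of 70000 is made up for the example, and only the Windows spelling is shown.

```cpp
// Largest value the cost clause can hold: 2^16 - 1.
#define MAX_COST_CYCLES 65535

// Hypothetical cycle estimate for the function below; replace it with a
// measured value for real code.
#define EST_CYCLES 70000

// Use cost(...) when the estimate fits, otherwise fall back to profitable.
// Compilers other than the Intel C++ Compiler Classic get no annotation.
#if !defined(__INTEL_COMPILER) || !defined(_WIN32)
#  define HEAVY_ANNOTATION
#elif EST_CYCLES <= MAX_COST_CYCLES
#  define HEAVY_ANNOTATION __declspec(concurrency_safe(cost(EST_CYCLES)))
#else
#  define HEAVY_ANNOTATION __declspec(concurrency_safe(profitable))
#endif

// True when a cycle estimate can be expressed with the cost clause.
constexpr bool fits_cost_clause(unsigned long cycles) {
    return cycles <= MAX_COST_CYCLES;
}

// Stand-in for an expensive function; it reads x and writes nothing shared.
HEAVY_ANNOTATION
double sum_of_squares(const double* x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; ++i) {
        s += x[i] * x[i];
    }
    return s;
}
```

With EST_CYCLES set to 70000, the estimate exceeds 65535, so the Intel build path selects the profitable clause.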

The following example illustrates the use of this declaration.

Example using __declspec(concurrency_safe(cost(cycles) | profitable))
#define N 10 
#define M 40 
#define NValue N
 
#if defined(COSTLOW)
 
// The function cost is ~5 cycles, so the loop calling "foo" will not be parallelized
__declspec(concurrency_safe(cost(5))) 
#elif defined(COSTHIGH)
 
// The function cost is ~200 cycles, so the loop calling "foo" will be parallelized
__declspec(concurrency_safe(cost(200))) 
#elif defined(PROFITABLE)
 
// The function is profitable to be executed in parallel, so the loop calling "foo" 
// should be parallelized.
__declspec(concurrency_safe(profitable)) 
#endif
 
__declspec(noinline) 
int foo(float A[], float B[]) {
   for (int i = 0; i < N; i++) {
     B[i] = A[i];
   }
   return N; 
}
 
int testp(float A[], float B[], float* In[], float* Out[]) {
   int i, j;
   for (i = 0; i < M; i++) {
     foo (A, B);
     for (j = 0; j < N; j++) {
       Out[i][j] = In[i][j] + (NValue*j);
     }
   }
   return N; 
}
 
[C:/temp] icl -c -DCOSTLOW -Qparallel -Qpar-report2 -Qansi-alias v.cpp 
C:\temp\v.cpp(28): (col. 3) remark: loop was not parallelized: insufficient computational work.
 
[C:/temp] icl -c -DCOSTHIGH -Qparallel -Qpar-report -Qansi-alias v.cpp 
C:\temp\v.cpp(28): (col. 3) remark: LOOP WAS AUTO-PARALLELIZED.
 
[C:/temp] icl -c -DPROFITABLE -Qparallel -Qpar-report -Qansi-alias v.cpp 
C:\temp\v.cpp(28): (col. 3) remark: LOOP WAS AUTO-PARALLELIZED.