Three programming points to mention on Offloaded Code for Intel® Graphics Technology

Support for Intel® Graphics Technology is part of the compiler product. Developers should follow the programming guidelines below to use the compiler and Graphics Technology (GT) features efficiently.

1. "#pragma offload target(gfx)" is required to mark the parallel loop as an "offload region". "__declspec(target(gfx))" does not do that; it merely states that the function should be compiled to run on the GFX target.

For example, the following incorrect code snippet comes from a customer:

__declspec(target(gfx)) void function(double *vector) {
#pragma parallel_loop
//do something with the vector in a parallel loop
}

Compiling this code with: icl test.c /Qoffload /Qstd=c99 /Qopenmp produces an error saying "a statement with parallel loop pragma must also have an offload target(gfx) pragma before the for loop in the function". So don't forget to add the offload target(gfx) pragma in this case.
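A corrected version of the customer snippet could look like the sketch below. The function name scale_vector, the length parameter n, and the loop body are hypothetical, since the original snippet elides them. The pin(...) clause is an assumption about how the array is shared with the GFX target, so check it against the compiler documentation. The pragmas are guarded by __INTEL_COMPILER so that other compilers build the loop as plain host code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: doubles each element of vector[0..n-1].
   The function itself runs on the host; only the loop is offloaded,
   so __declspec(target(gfx)) is not used on the function. */
void scale_vector(double *vector, int n) {
    int i;
    assert(vector != NULL && n >= 0);
#ifdef __INTEL_COMPILER
    /* The offload pragma marks the loop below as the offload region.
       pin(...) is an assumed data clause for the pointer argument. */
#pragma offload target(gfx) pin(vector: length(n))
#pragma parallel_loop
#endif
    for (i = 0; i < n; i++) {
        vector[i] = 2.0 * vector[i];
    }
}
```

The key change is that the offload pragma sits directly before the parallel loop, which is what the error message asks for.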

2. You can place #pragma offload target(gfx) only before a perfect loop nest explicitly marked as parallel by #pragma parallel_loop.

For example, the following code prints var = 55, i = 0.
//
int var = 55;     
int i = 0;
#pragma offload target(gfx)
#pragma parallel_loop
for (i = 0; i < 1; i++)
{
   ++var;
}
printf("var = %d, i = %d\n", var, i);
//

3. Code the function differently depending on whether you want to call it from an offload region or want its body to be the offload region.

1) If you want to call the function from an offload region, declare it with the target(gfx) attribute and do not put #pragma parallel_loop in its body, e.g.

  __declspec(target(gfx)) void function(double *vector) {
    // do something with the vector (no parallel_loop pragma here; the caller's loop is the parallel loop)
  }
...
#pragma offload ..
#pragma parallel_loop
for (...) {
   ...
   function(vec);
}
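The pattern above can be fleshed out as in the following sketch. The names GFX_FUNC, square, and square_all are hypothetical, and the pin(...) clause is an assumption about how the array is shared with the GFX target. The __INTEL_COMPILER guards are only there so that other compilers build the same code for the host.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: only the Intel compiler understands target(gfx). */
#ifdef __INTEL_COMPILER
#define GFX_FUNC __declspec(target(gfx))
#else
#define GFX_FUNC
#endif

/* Hypothetical helper compiled for the GFX target.
   It contains no parallel_loop pragma of its own. */
GFX_FUNC double square(double x) {
    return x * x;
}

/* Host function: the loop is the offload region and calls square(). */
void square_all(double *vec, int n) {
    int i;
    assert(vec != NULL && n >= 0);
#ifdef __INTEL_COMPILER
    /* pin(...) is an assumed data clause for the pointer argument. */
#pragma offload target(gfx) pin(vec: length(n))
#pragma parallel_loop
#endif
    for (i = 0; i < n; i++) {
        vec[i] = square(vec[i]);
    }
}
```

Here the attribute goes on the callee and the two pragmas go on the caller's loop, so each piece does only the job described above.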

2) If you want the body of the function to be the offload region then it should be coded like this:

void function(double *vector) {
#pragma offload target(gfx)
#pragma parallel_loop
  // do something with the vector in a parallel loop
}

For more information, please refer to the formal documentation shipped with the Compiler 14.0 SP1.

 
