Fortran vs. C Offload Directives and Functions

By Belinda M Liviero, published on December 20, 2012

This is a "cheatsheet" comparing the Fortran and C++ offload directives and functions in the context of programming for the Intel® Xeon Phi™ coprocessor.

  Fortran (explicit model) C/C++ (explicit model) C/C++ (implicit model)
includes
    Fortran (explicit model): use mic_lib
    C/C++ (explicit model): #include offload.h
    C/C++ (implicit model): #include cilk.h

functions
    Fortran (explicit model):
        result = offload_number_of_devices()
        result = offload_get_device_number()
        offload_report(n)
        omp_set_num_threads_target(TARGET_MIC, mic_num, num_threads)
        More APIs can be found in /opt/intel/include/intel64/mic_lib.f90
    C/C++ (explicit model):
        result = _Offload_number_of_devices()
        result = _Offload_get_device_number()
        __Offload_report(n)
        omp_set_num_threads_target(TARGET_MIC, mic_num, num_threads)
        More APIs can be found in /opt/intel/include/offload.h
    C/C++ (implicit model):
        More APIs can be found in /opt/intel/include/offload.h

environment variables
    Same for Fortran and C++.

preprocessor macros
    Same for Fortran and C++, but note: the macros are used with #ifdef MACRO_NAME ... #else ... #endif and require that the Fortran file name end in F90 rather than f90, or that the command line include the -fpp option.
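As a minimal sketch of how the query functions and the preprocessor macros fit together in C, the helper below (a hypothetical name, not part of the Intel API) returns the number of coprocessors when the file is built with offload support and 0 otherwise. It assumes the `__INTEL_OFFLOAD` macro, which Intel's compiler documentation describes as defined when offload compilation is enabled; that macro name does not appear in the table above.

```c
#ifdef __INTEL_OFFLOAD
#include <offload.h>   /* declares _Offload_number_of_devices() */
#endif

/* Hypothetical helper: number of coprocessors visible to the host,
 * or 0 when the file is compiled without offload support. */
int coprocessor_count(void)
{
#ifdef __INTEL_OFFLOAD
    return _Offload_number_of_devices();
#else
    return 0;   /* host-only build: no coprocessor can be used */
#endif
}
```

A caller could use the returned count to decide whether to pick a specific device number for `target(mic:n)` or to stay on the host.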

Directives

Offload next statement

Fortran (explicit model):

    !dir$ offload target(mic[:n]) <opt_offload_clauses>
    <statement>

where <statement> is

    call subroutine_name(args)

or

    ret_val = function_name(args)

C/C++ (explicit model):

    #pragma offload target(mic[:n]) <opt_offload_clauses>
    <statement>

where <statement> is any valid C++ statement, including compound statements: for example, an if statement, a for statement, or a simple block statement such as {s1;s2;s3;…}

C/C++ (implicit model):

    ret_val = _Cilk_offload function_name(args)

    ret_val = _Cilk_offload_to n function_name(args)

    ret_val = _Cilk_spawn _Cilk_offload function_name(args)

    ret_val = _Cilk_spawn _Cilk_offload_to n function_name(args)
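The explicit C/C++ form can be sketched as follows. The function name `sum_to_n` and the choice of coprocessor 0 are assumptions made for illustration. The offloaded statement is a single for statement, one of the statement kinds listed above. A compiler without offload support ignores the unknown pragma, so the loop then simply runs on the host.

```c
/* Hypothetical example: sum the integers 1..n.
 * With an offload-capable Intel compiler the for statement that follows
 * the pragma runs on coprocessor 0; other compilers ignore the pragma. */
long sum_to_n(int n)
{
    long total = 0;

    /* n is copied to the coprocessor; total is copied in and back out. */
    #pragma offload target(mic:0) in(n) inout(total)
    for (int i = 1; i <= n; ++i)
        total += i;

    return total;
}
```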

Offload enclosed block of code

Fortran (explicit model):

    !dir$ offload begin target(mic[:n]) <opt_offload_clauses>
    <statements>
    !dir$ end offload

C/C++ (explicit model): See the note above about compound statements.
Offload OpenMP parallel section (or cilk_for construct)

Fortran (explicit model):

    !dir$ [omp] offload target(mic[:n]) <opt_offload_clauses>
    !$omp <parallel_directive>
    <statements>
    !$omp <end_directive>

C/C++ (explicit model):

    #pragma offload target(mic[:n]) <opt_offload_clauses>
    #pragma omp <parallel_directive>
    <compound_statement>

C/C++ (implicit model):

    ret_val = _Cilk_offload _Cilk_for (init-expr; test-expr; incr-expr) {statements}

    ret_val = _Cilk_offload_to n _Cilk_for (init-expr; test-expr; incr-expr) {statements}
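A sketch of the explicit C/C++ form with OpenMP, under the assumption that the arrays are passed as pointers and therefore need a `length` modifier. The function name `dot` is hypothetical. Without offload or OpenMP support both pragmas are ignored and the loop runs serially on the host with the same result.

```c
/* Hypothetical example: dot product of two n-element arrays.
 * The offload pragma moves the following OpenMP construct to a
 * coprocessor; the OpenMP pragma parallelizes the loop there. */
double dot(double *x, double *y, int n)
{
    double result = 0.0;

    /* x and y are pointers, so length(n) states how many elements to copy. */
    #pragma offload target(mic) in(x, y : length(n)) inout(result)
    #pragma omp parallel for reduction(+ : result)
    for (int i = 0; i < n; ++i)
        result += x[i] * y[i];

    return result;
}
```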

Start asynchronous data transfer to Coprocessor

    Fortran (explicit model): !dir$ offload_transfer <in_offload_clauses> signal(signal_var)
    C/C++ (explicit model): #pragma offload_transfer <in_offload_clauses> signal(&signal_var)

Complete asynchronous data transfer from Coprocessor

    Fortran (explicit model): !dir$ offload_transfer wait(signal_var) <out_offload_clauses>
    C/C++ (explicit model): #pragma offload_transfer wait(&signal_var) <out_offload_clauses>

Offload wait

    Fortran (explicit model): !dir$ offload_wait(signal_var)
    C/C++ (explicit model): #pragma offload_wait(&signal_var)
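The asynchronous pattern above might look like this in C. This is a sketch, not a definitive implementation: the function name `scaled_sum` is hypothetical, and the `target(mic:0)` clause on the transfer and wait pragmas is an assumption based on the fuller syntax in Intel's compiler documentation, which the table abbreviates. The buffer is kept on the coprocessor after the transfer so the later offload can use it.

```c
/* Hypothetical example: start an upload, wait for it, then compute. */
double scaled_sum(double *data, int n, double factor)
{
    double total = 0.0;
    char sig;   /* only its address is used, as the signal tag */

    /* Start copying data to coprocessor 0 and return immediately.
     * free_if(0) keeps the buffer allocated there after the transfer. */
    #pragma offload_transfer target(mic:0) \
            in(data : length(n) alloc_if(1) free_if(0)) signal(&sig)

    /* Independent host work could run here while the copy proceeds. */

    /* Block until the transfer tagged with &sig has completed. */
    #pragma offload_wait target(mic:0) wait(&sig)

    /* Use the buffer already on the coprocessor, then release it. */
    #pragma offload target(mic:0) \
            nocopy(data : length(n) alloc_if(0) free_if(1)) \
            in(factor) inout(total)
    for (int i = 0; i < n; ++i)
        total += factor * data[i];

    return total;
}
```

On a compiler without offload support the pragmas are ignored and the function reduces to the plain host loop.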
Mark a function or subroutine as needing both a host and Coprocessor version

Fortran (explicit model):

    !dir$ attributes offload:mic :: routine_name

C/C++ (explicit model):

    __attribute__ ((target(mic))) function_declaration

    __declspec(target(mic)) function_declaration

C/C++ (implicit model):

    function_type C_Cilk_offload function_declaration

Mark a global variable as needing to be allocated memory on both the host and Coprocessor

Fortran (explicit model):

    !dir$ attributes offload:mic :: var_name

C/C++ (explicit model):

    __attribute__ ((target(mic))) variable_declaration

    __declspec (target(mic)) variable_declaration

C/C++ (implicit model):

    _Cilk_shared variable_declaration

Mark everything in the enclosed region as needing both host and Coprocessor versions

Fortran (explicit model):

    !dir$ options /offload-attribute-target=mic
    !dir$ end options

(Valid only in the declarations section of a subroutine or function)

C/C++ (explicit model):

    #pragma offload_attribute(push, target(mic))
    #pragma offload_attribute(pop)

C/C++ (implicit model):

    #pragma offload_attribute(push, _Cilk_shared)
    #pragma offload_attribute(pop)
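A brief sketch of the explicit-model push/pop region in C. The names `offset`, `add_offset`, and `offloaded_add` are hypothetical. Declarations between the two pragmas get both a host and a coprocessor version, so the offloaded statement can call the function and read the global. Compilers without offload support ignore the pragmas and build a host-only program.

```c
/* Everything between push and pop is built for host and coprocessor. */
#pragma offload_attribute(push, target(mic))
int offset = 10;                        /* global present on both sides */
int add_offset(int x) { return x + offset; }
#pragma offload_attribute(pop)

/* Hypothetical caller: the assignment statement is offloaded. */
int offloaded_add(int x)
{
    int y = 0;

    #pragma offload target(mic) in(x) out(y)
    y = add_offset(x);

    return y;
}
```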

Allocate memory in the shared memory areas for host and Coprocessor

C/C++ (implicit model):

    ptr = _Offload_shared_malloc(size)

    ptr = _Offload_shared_aligned_malloc(data_size, alignment_size)

    _Offload_shared_free(ptr)

    _Offload_shared_aligned_free(ptr)

Offload Clauses Used in Directives

   

if(condition)
    Fortran: condition evaluates to .true. or .false.
    C/C++: condition evaluates to 0 or 1

signal
    Fortran: signal(signal_var)
    C/C++: signal(&signal_var)

wait
    Fortran: wait(var)
    C/C++: wait(&var)

Identical in Fortran and C++:

    in(var_list[:modifiers])
    out(var_list[:modifiers])
    inout(var_list[:modifiers])
    nocopy(var_list[:modifiers])
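To illustrate how these clauses combine in C, the sketch below offloads a loop only when the array is large enough to be worth the transfer. The function name `scale` and the threshold of 10000 elements are illustrative assumptions, not recommendations from the source. When the `if` condition evaluates to 0, the statement runs on the host instead.

```c
/* Hypothetical example: multiply every element of v by k.
 * The work is offloaded only when n exceeds an assumed threshold. */
void scale(float *v, int n, float k)
{
    /* inout: v is copied to the coprocessor and back; in: k is copied over. */
    #pragma offload target(mic) if(n > 10000) inout(v : length(n)) in(k)
    for (int i = 0; i < n; ++i)
        v[i] *= k;
}
```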
 

Modifiers that Can Be Used with In, Out, Inout and Nocopy Clauses

length(num_elem)
    Same in Fortran and C/C++.

alloc_if(condition)
    Fortran: condition evaluates to .true. or .false.
    C/C++: condition evaluates to 1 or 0

free_if(condition)
    Fortran: condition evaluates to .true. or .false.
    C/C++: condition evaluates to 1 or 0

align(n)
    Same in Fortran and C/C++.

alloc
    Fortran: alloc([first_index:last_index])
    C/C++: alloc([first_index:element_count])

into(var_name)
    Same in Fortran and C/C++.
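One common use of `alloc_if` and `free_if` is keeping a buffer on the coprocessor across several offloads so it is uploaded only once. The sketch below assumes coprocessor 0. The function name `repeated_sum` and the four helper macros are hypothetical conveniences, not names from the source, and the use of `offload_transfer` with `nocopy` purely to release the buffer is likewise an assumption.

```c
/* Illustrative shorthands for the four modifier settings. */
#define ALLOC  alloc_if(1)   /* allocate coprocessor memory now        */
#define REUSE  alloc_if(0)   /* use memory allocated by an earlier step */
#define FREE   free_if(1)    /* release coprocessor memory afterwards   */
#define RETAIN free_if(0)    /* keep coprocessor memory for later use   */

/* Hypothetical example: sum the same array `passes` times. */
double repeated_sum(double *data, int n, int passes)
{
    double total = 0.0;

    /* Upload once and keep the buffer on the coprocessor. */
    #pragma offload_transfer target(mic:0) in(data : length(n) ALLOC RETAIN)

    for (int p = 0; p < passes; ++p) {
        /* No copy: work on the data already resident on the coprocessor. */
        #pragma offload target(mic:0) \
                nocopy(data : length(n) REUSE RETAIN) inout(total)
        for (int i = 0; i < n; ++i)
            total += data[i];
    }

    /* Release the coprocessor buffer without transferring anything. */
    #pragma offload_transfer target(mic:0) nocopy(data : length(n) REUSE FREE)

    return total;
}
```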
