Fortran vs. C offload directives and functions

This is a "cheatsheet" comparing the Fortran and C/C++ offload directives and functions used when programming for the Intel® Xeon Phi™ Coprocessor.

 

Each entry lists, in this order and where applicable, the form for Fortran (explicit model), C/C++ (explicit model), and C/C++ (implicit model).

includes

Fortran (explicit): use mic_lib
C/C++ (explicit): #include offload.h
C/C++ (implicit): #include cilk.h

functions

Fortran (explicit): result = offload_number_of_devices()
C/C++ (explicit): result = _Offload_number_of_devices()

Fortran (explicit): result = offload_get_device_number()
C/C++ (explicit): result = _Offload_get_device_number()

Fortran (explicit): offload_report(n)
C/C++ (explicit): __Offload_report(n)

Fortran (explicit): omp_set_num_threads_target(TARGET_MIC, mic_num, num_threads)
C/C++ (explicit): omp_set_num_threads_target(TARGET_MIC, mic_num, num_threads)

More APIs can be found in:
Fortran (explicit): /opt/intel/include/intel64/mic_lib.f90
C/C++ (explicit and implicit): /opt/intel/include/offload.h
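As a rough illustration of the query functions listed above, the following C sketch wraps _Offload_number_of_devices() so that the same source also builds without offload support. The guard macro __INTEL_OFFLOAD and the helper names coprocessor_count and report_coprocessors are assumptions made for this example and do not come from the cheatsheet.

```c
#include <stdio.h>

#ifdef __INTEL_OFFLOAD          /* assumption: set by the Intel compiler when offload is enabled */
#include <offload.h>            /* declares _Offload_number_of_devices() */
#endif

/* Hypothetical helper: number of coprocessors visible to the host,
   or 0 when the program was built without offload support. */
int coprocessor_count(void)
{
#ifdef __INTEL_OFFLOAD
    return _Offload_number_of_devices();
#else
    return 0;                   /* host-only build: no offload runtime to query */
#endif
}

/* Hypothetical helper: print a one-line summary and return the count. */
int report_coprocessors(void)
{
    int n = coprocessor_count();
    printf("coprocessors available: %d\n", n);
    return n;
}
```

A program might call report_coprocessors() once at startup and choose a host-only code path when it returns 0.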
environment variables

Same for Fortran and C++.

preprocessor macros

Same for Fortran and C++. Note, however, that the macros are used with #ifdef MACRO_NAME ... #else ... #endif, so a Fortran source file must be preprocessed: either give the file the extension .F90 rather than .f90, or add the -fpp option to the command line.
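The #ifdef MACRO_NAME ... #else ... #endif pattern can be sketched in C as follows. The cheatsheet does not name a specific macro; this example assumes __MIC__, which the Intel compiler is understood to define when generating coprocessor code. The function names are hypothetical.

```c
/* Hypothetical helper: returns 1 when this translation unit was compiled
   for the coprocessor, 0 when it was compiled for the host.
   Assumption: __MIC__ is defined only in the coprocessor compilation. */
int running_on_coprocessor(void)
{
#ifdef __MIC__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: a printable name for the compilation target. */
const char *build_target_name(void)
{
#ifdef __MIC__
    return "coprocessor";
#else
    return "host";
#endif
}
```

The same pattern applies in a preprocessed Fortran source file, with the Fortran statements placed between the #ifdef, #else and #endif lines.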

Directives


Offload next statement

Fortran (explicit):
!dir$ offload target(mic[:n]) <opt_offload_clauses>
<statement>

where <statement> is

call subroutine_name(args)

or

ret_val = function_name(args)

C/C++ (explicit):
#pragma offload target(mic[:n]) <opt_offload_clauses>
<statement>

where <statement> is any valid C++ statement, including compound statements: for example, an if statement, a for statement, or a simple block such as {s1; s2; s3; …}

C/C++ (implicit):
ret_val = _Cilk_offload function_name(args)

ret_val = _Cilk_offload_to n function_name(args)

ret_val = _Cilk_spawn _Cilk_offload function_name(args)

ret_val = _Cilk_spawn _Cilk_offload_to n function_name(args)
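A minimal sketch of the explicit C/C++ form, offloading a single for statement. The function scale_array and its parameters are hypothetical; the in, out and length clauses are the ones listed later in this cheatsheet. A compiler without offload support is expected to ignore the pragma and run the loop on the host, so the result is the same either way.

```c
/* Hypothetical example: dst[i] = factor * src[i] for i in [0, n).
   With an offload-capable compiler the loop runs on the coprocessor;
   otherwise the unknown pragma is ignored and it runs on the host. */
void scale_array(const double *src, double *dst, int n, double factor)
{
    int i;

    /* Pointer data needs an explicit length(); src is copied to the
       coprocessor before the loop and dst is copied back afterwards. */
#pragma offload target(mic) in(src : length(n)) out(dst : length(n))
    for (i = 0; i < n; ++i) {
        dst[i] = factor * src[i];
    }
}
```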

Offload enclosed block of code

Fortran (explicit):
!dir$ offload begin target(mic[:n]) <opt_offload_clauses>
<statements>
!dir$ end offload

C/C++ (explicit): see the note above about compound statements.

Offload OpenMP parallel section (or cilk_for construct)

Fortran (explicit):
!dir$ [omp] offload target(mic[:n]) <opt_offload_clauses>
!$omp <parallel_directive>
<statements>
!$omp <end_directive>

C/C++ (explicit):
#pragma offload target(mic[:n]) <opt_offload_clauses>
#pragma omp <parallel_directive>
<compound_statement>

C/C++ (implicit):
_Cilk_offload _Cilk_for (init-expr; test-expr; incr-expr) {statements}

_Cilk_offload_to n _Cilk_for (init-expr; test-expr; incr-expr) {statements}
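The explicit C/C++ form for an OpenMP region might look like the sketch below, which offloads a parallel reduction. The function dot_product is hypothetical. Without offload or OpenMP support both pragmas are expected to be ignored and the loop runs serially on the host.

```c
/* Hypothetical example: dot product of x and y, each with n elements. */
double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;

    /* The offload pragma applies to the OpenMP construct that follows it.
       x and y are copied in; sum is copied in and back out. */
#pragma offload target(mic) in(x, y : length(n)) inout(sum)
#pragma omp parallel for reduction(+ : sum)
    for (i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```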

Start asynchronous data transfer to Coprocessor

Fortran (explicit): !dir$ offload_transfer target(mic[:n]) <in_offload_clauses> signal(signal_var)
C/C++ (explicit): #pragma offload_transfer target(mic[:n]) <in_offload_clauses> signal(&signal_var)
 

Complete asynchronous data transfer from Coprocessor

Fortran (explicit): !dir$ offload_transfer target(mic[:n]) wait(signal_var) <out_offload_clauses>
C/C++ (explicit): #pragma offload_transfer target(mic[:n]) wait(&signal_var) <out_offload_clauses>
 

Offload wait

Fortran (explicit): !dir$ offload_wait target(mic[:n]) wait(signal_var)
C/C++ (explicit): #pragma offload_wait target(mic[:n]) wait(&signal_var)
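The signal and wait clauses are typically paired: one directive starts a transfer and a later one waits for it. The sketch below is one possible arrangement in C, assuming the Intel compiler's pragma syntax (including a target clause on offload_transfer) and using a hypothetical function sum_after_async_transfer. On a compiler without offload support the pragmas are expected to be ignored and the loop simply runs on the host.

```c
/* Hypothetical example: send data to coprocessor 0 asynchronously,
   then sum it there once the transfer has completed. */
double sum_after_async_transfer(double *data, int n)
{
    double total = 0.0;
    char sig = 0;      /* its address serves as the signal tag */
    int i;

    (void)sig;         /* avoids an unused-variable warning in host-only builds */

    /* Start the copy and keep the buffer allocated on the coprocessor. */
#pragma offload_transfer target(mic:0) in(data : length(n) alloc_if(1) free_if(0)) signal(&sig)

    /* Independent host work could overlap with the transfer here. */

    /* Wait for the transfer, reuse the buffer without copying it again,
       and free it on the coprocessor when the offload finishes. */
#pragma offload target(mic:0) wait(&sig) nocopy(data : length(n) alloc_if(0) free_if(1)) inout(total)
    for (i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```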

Mark a function or subroutine as needing both a host and Coprocessor version

Fortran (explicit): !dir$ attributes offload:mic :: routine_name

C/C++ (explicit):
__attribute__((target(mic))) function_declaration

__declspec(target(mic)) function_declaration

C/C++ (implicit): function_type _Cilk_shared function_declaration

Mark a global variable as needing to be allocated memory on both the host and Coprocessor

Fortran (explicit): !dir$ attributes offload:mic :: var_name

C/C++ (explicit):
__attribute__((target(mic))) variable_declaration

__declspec(target(mic)) variable_declaration

C/C++ (implicit): _Cilk_shared variable_declaration
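The two marking forms above (functions and global variables) might be combined as in this C sketch. The macro ON_MIC, the guard __INTEL_OFFLOAD, and the names bias, square and offloaded_square are assumptions made for illustration. The macro expands to nothing on compilers that do not support offload, so the code still builds there.

```c
#ifdef __INTEL_OFFLOAD                       /* assumption: set when offload is enabled */
#define ON_MIC __attribute__((target(mic)))  /* build host and coprocessor versions */
#else
#define ON_MIC                               /* host-only build: no attribute needed */
#endif

/* Global variable allocated on both the host and the coprocessor. */
ON_MIC int bias = 1;

/* Function compiled for both the host and the coprocessor. */
ON_MIC int square(int x)
{
    return x * x;
}

/* Hypothetical example: call the marked function from offloaded code. */
int offloaded_square(int x)
{
    int result = 0;

#pragma offload target(mic) in(x, bias) out(result)
    {
        result = square(x) + bias;
    }
    return result;
}
```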

Mark everything in the enclosed region as needing both host and Coprocessor versions

Fortran (explicit; valid only in the declarations section of a subroutine or function):
!dir$ options /offload-attribute-target=mic
<declarations>
!dir$ end options

C/C++ (explicit):
#pragma offload_attribute(push, target(mic))
<declarations>
#pragma offload_attribute(pop)

C/C++ (implicit):
#pragma offload_attribute(push, _Cilk_shared)
<declarations>
#pragma offload_attribute(pop)

Allocate memory in the shared memory areas for host and Coprocessor

   

C/C++ (implicit) only:

ptr = _Offload_shared_malloc(size)

ptr = _Offload_shared_aligned_malloc(data_size, alignment_size)

_Offload_shared_free(ptr)

_Offload_shared_aligned_free(ptr)


Offload clauses used in directives


 

 

Fortran (explicit): if(condition), where condition evaluates to .true. or .false.
C/C++ (explicit): if(condition), where condition evaluates to 0 or 1

Fortran (explicit): signal(signal_var)
C/C++ (explicit): signal(&signal_var)

Fortran (explicit): wait(var)
C/C++ (explicit): wait(&var)

Identical in Fortran and C++:

in(var_list[:modifiers])
out(var_list[:modifiers])
inout(var_list[:modifiers])
nocopy(var_list[:modifiers])
nocopy(var_list[:modifiers]) nocopy(var_list[:modifiers])  
 

Modifiers that can be used with in, out, inout and nocopy clauses


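Fortran (explicit): length(num_elem)
C/C++ (explicit): length(num_elem)

Fortran (explicit): alloc_if(condition), where condition evaluates to .true. or .false.
C/C++ (explicit): alloc_if(condition), where condition evaluates to 1 or 0

Fortran (explicit): free_if(condition), where condition evaluates to .true. or .false.
C/C++ (explicit): free_if(condition), where condition evaluates to 1 or 0

Fortran (explicit): align(n)
C/C++ (explicit): align(n)

Fortran (explicit): alloc([first_index:last_index])
C/C++ (explicit): alloc([first_index:element_count])

Fortran (explicit): into(var_name)
C/C++ (explicit): into(var_name)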