Restrictions on Offloaded Code Using a Pragma

This topic only applies to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Offloaded code has the following restrictions:

  • Exception handling works as usual within code running on the CPU and within code running on the coprocessor: exceptions can be raised, caught, and handled on the CPU, or raised, caught, and handled on the coprocessor. However, it is not possible to propagate an exception from the coprocessor to the CPU.
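
    One common way to live with this restriction is to catch the exception on the side that raised it and pass a plain status value back to the host. The following is only a sketch: TARGET_MIC, checked_divide, divide_with_offload, and exception_usage_example are hypothetical names, and it assumes that the __INTEL_OFFLOAD macro is defined when offload is enabled. With other compilers the pragma is ignored and the code runs entirely on the host.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper macro: apply the target attribute only when the
// compiler reports offload support (assumed to be signalled by __INTEL_OFFLOAD).
#ifdef __INTEL_OFFLOAD
#define TARGET_MIC __declspec(target(mic))
#else
#define TARGET_MIC
#endif

// May run on the coprocessor. The exception is raised and caught on the
// same side; only an integer status is returned to the caller.
TARGET_MIC int checked_divide(int num, int den, int *result) {
    try {
        if (den == 0) throw 1;   // raised on whichever side runs this code
        *result = num / den;
        return 0;                // success
    } catch (int) {
        return 1;                // failure reported as data, not as an exception
    }
}

// Host-side wrapper: converts the returned status into a host exception.
int divide_with_offload(int num, int den) {
    int status = 0;
    int result = 0;
    #pragma offload target(mic) in(num, den) inout(status, result)
    {
        status = checked_divide(num, den, &result);
    }
    if (status != 0) throw std::runtime_error("offloaded division failed");
    return result;
}

// Usage: the host sees an exception, but it was raised on the host.
void exception_usage_example() {
    assert(divide_with_offload(10, 2) == 5);
    bool threw = false;
    try {
        divide_with_offload(1, 0);
    } catch (const std::runtime_error &) {
        threw = true;
    }
    assert(threw);
}
```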

  • Do not use the __MIC__ macro inside the statement following a #pragma offload statement. You can, however, use this macro in a subprogram called from the pragma.
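
    As an illustration, the sketch below keeps the __MIC__ test inside a subprogram and leaves the offloaded statement free of it. The names TARGET_MIC, running_on_coprocessor, probe_offload_target, and mic_macro_usage_example are hypothetical, and the use of __INTEL_OFFLOAD to detect offload support is an assumption.

```cpp
#include <cassert>

// Hypothetical helper macro (assumes __INTEL_OFFLOAD marks offload support).
#ifdef __INTEL_OFFLOAD
#define TARGET_MIC __declspec(target(mic))
#else
#define TARGET_MIC
#endif

// Subprogram called from the offload pragma: __MIC__ may be tested here.
TARGET_MIC int running_on_coprocessor() {
#ifdef __MIC__
    return 1;   // this copy was compiled for the coprocessor
#else
    return 0;   // this copy was compiled for the host CPU
#endif
}

int probe_offload_target() {
    int on_target = 0;
    // The offloaded statement itself contains no __MIC__ test.
    #pragma offload target(mic) inout(on_target)
    {
        on_target = running_on_coprocessor();
    }
    return on_target;
}

// Usage: the result is 1 only if the call really ran on the coprocessor.
void mic_macro_usage_example() {
    int where = probe_offload_target();
    assert(where == 0 || where == 1);
}
```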

  • The compiler does not inline functions inside statements following an offload pragma statement, including functions marked with the forceinline pragma. However, inline function expansion might occur at the compiler’s discretion inside a subprogram called from the offload pragma.

  • Multiple host CPU threads can execute concurrently while one host CPU thread offloads a section of code. In this case, synchronization mechanisms such as locks, atomics, mutexes, OpenMP atomic operations, OpenMP critical, OpenMP taskwait, and OpenMP barriers do not work between host CPU code and code offloaded to the target. However, if parallelism on the host CPU is enabled using OpenMP, then OpenMP synchronization at the end of a parallel region is guaranteed to work even if some part of the OpenMP parallel region has been offloaded to the target.

  • Global variables referenced by functions called from within offloaded code must be declared with matching target attributes to ensure that the variable is available on the target. The offloaded code cannot access the host CPU global variables. This is enforced by the compiler.
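
    A minimal sketch of matching target attributes follows. TARGET_MIC, scale_factor, scale, scale_with_offload, and global_usage_example are hypothetical names, and the __INTEL_OFFLOAD check is an assumption; the global and the function that reads it carry the same attribute.

```cpp
#include <cassert>

// Hypothetical helper macro (assumes __INTEL_OFFLOAD marks offload support).
#ifdef __INTEL_OFFLOAD
#define TARGET_MIC __declspec(target(mic))
#else
#define TARGET_MIC
#endif

// The global carries the target attribute, so a copy exists on the target.
// It is statically initialized and never modified in this sketch.
TARGET_MIC int scale_factor = 3;

// The function that references the global carries the same attribute.
TARGET_MIC int scale(int x) {
    return x * scale_factor;
}

int scale_with_offload(int x) {
    int y = 0;
    #pragma offload target(mic) in(x) inout(y)
    {
        y = scale(x);
    }
    return y;
}

// Usage: 4 scaled by the factor 3.
void global_usage_example() {
    assert(scale_with_offload(4) == 12);
}
```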

    By default, pointer variables are assumed to point to a single element of the corresponding type. The offloaded code may dereference the pointer and access a single element. The data element pointed to is automatically copied into target memory and the pointer value is adjusted accordingly. The element-count-expr expression available with in, out, and inout parameters enables variable-length data to be copied back and forth.
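
    For instance, an element count can be supplied with the length modifier of an in clause. The sketch below uses the hypothetical names sum_on_target and length_usage_example; with compilers that do not recognize the pragma, the loop simply runs on the host.

```cpp
#include <cassert>

// Sums n ints. Without length(n), only the single element *data would be
// copied to the target; length(n) asks for n elements starting at data.
long sum_on_target(int *data, int n) {
    long total = 0;
    #pragma offload target(mic) in(data : length(n)) inout(total)
    {
        for (int i = 0; i < n; i++) {
            total += data[i];
        }
    }
    return total;
}

// Usage: four elements are copied in and summed.
void length_usage_example() {
    int values[4] = {1, 2, 3, 4};
    assert(sum_on_target(values, 4) == 10);
}
```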

  • Only pointers to non-pointer types are supported. Pointers to pointer variables are not supported. The compiler enforces this restriction.

  • Arrays are supported provided the array element type is a scalar or a bitwise copyable struct or class. Arrays of pointers are therefore not supported.
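
    The sketch below offloads an array of a simple struct. It treats the standard trait std::is_trivially_copyable as a host-side stand-in for "bitwise copyable", which is an assumption and not the compiler's own check. Sample, weighted_sum, and array_usage_example are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Element type with no pointer members and no user-defined copy operations.
struct Sample {
    float value;
    int weight;
};

// Assumption: trivially copyable approximates the bitwise-copyable rule.
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be safe to copy byte for byte");

float weighted_sum(Sample *samples, int n) {
    float total = 0.0f;
    #pragma offload target(mic) in(samples : length(n)) inout(total)
    {
        for (int i = 0; i < n; i++) {
            total += samples[i].value * samples[i].weight;
        }
    }
    return total;
}

// Usage: 1.5 * 2 + 2.0 * 3 = 9.0
void array_usage_example() {
    Sample data[2] = {{1.5f, 2}, {2.0f, 3}};
    assert(weighted_sum(data, 2) == 9.0f);
}
```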

  • Because pointers are not copied across the host-target interface, but instead the data they point to is copied, do not assume that the relative distance between pointers that point to distinct variables remains the same between the host and target. Pointers within the same data structure still have the same distance between them after offload. Thus, some pointer comparisons and arithmetic that were meaningful on the host CPU can no longer be used reliably on the target.

  • Similarly, although the data pointed to is available after an offload, the program cannot assume that the same user variable is pointed to after offload. For example, consider the following code:

    {
        int a = 55;
        int *p = &a;
        #pragma offload
        {
            q = p;
            ...
        }
    }

    Although q on the target will point to the value 55, the value of q will not be &a on the target.

  • Unions containing a combination of pointer and non-pointer members are treated as holding the non-pointer value type. Thus, no special treatment is given to the pointer, and the data pointed to is not copied to the target.

  • Unions consisting entirely of pointer members are not allowed to be copied between the host and the target.

  • If an offloaded statement calls a function defined in a separate file and that function references global variables, then those global variables cannot be copied in or out because the references are not visible to the compiler. Those variables are copied in or out if they are also referenced in the offloaded statement.

    Global variables such as these must be explicitly named in in and out clauses in the offload specification. When these global variables are file-scope static variables, then they cannot be named in the in or out clauses. You need to access their values using one of the following methods:

    • Make them external and add them to the in or out clauses in the offload specification.

    • Fetch the variable values into local variables using functions designed specifically for that purpose, and then add the local variables to the in or out clauses.
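
    The second method might look like the following sketch, in which a file-scope static variable is read and written only through accessor functions and a local copy is named in the inout clause. All names here (file_counter, get_file_counter, set_file_counter, bump_counter_with_offload, static_usage_example) are hypothetical.

```cpp
#include <cassert>

// File-scope static: it cannot be named in an in or out clause directly.
static int file_counter = 7;

// Accessor functions written specifically to fetch and store the value.
int get_file_counter() { return file_counter; }
void set_file_counter(int value) { file_counter = value; }

int bump_counter_with_offload() {
    int local = get_file_counter();   // fetch into a local on the host
    #pragma offload target(mic) inout(local)
    {
        local += 1;                   // the offloaded code sees only the local
    }
    set_file_counter(local);          // store the result back on the host
    return local;
}

// Usage: the static variable advances by one per call.
void static_usage_example() {
    int before = get_file_counter();
    assert(bump_counter_with_offload() == before + 1);
    assert(get_file_counter() == before + 1);
}
```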

  • You cannot use objects that are not bitwise copyable, such as the ostream object std::cout, inside a #pragma offload region. The compiler enforces this restriction and issues an error such as: error: variable "std::cout" is not bitwise copyable.

  • There are three Intel® Cilk™ Plus constructs: _Cilk_spawn, _Cilk_sync, and _Cilk_for. You can use all of these in functions called from a #pragma offload construct, but you can only use _Cilk_for directly within the offloaded construct. You cannot use _Cilk_spawn and _Cilk_sync within a #pragma offload construct, because it is illegal to offload only a portion of an Intel® Cilk™ Plus spawning routine. The whole spawning routine must be offloaded.

    For example, the following code is illegal:

    #pragma offload target(mic)
    {
         _Cilk_spawn f();     // _Cilk_spawn used within 
    }                         // offloaded construct
    

    The following code is legal:

    #pragma offload target(mic)
    {
         g();
    }
    
    ...
    
    void g()
    {
         _Cilk_spawn f();     // _Cilk_spawn used within
    }                         // offloaded function
    

    The following code is illegal:

    void foo()
    {
       #pragma offload target(mic)
       {
              _Cilk_spawn test();         // cannot use inside offload
              test();
              _Cilk_sync;                 // cannot use inside offload
       }
    }
    

    The following code is legal:

    __declspec(target(mic))
    void foo()
    {
       #pragma offload target(mic)
       {
              _Cilk_spawn test();
              test();
              _Cilk_sync;
       }
    }

    void bar()
    {
       #pragma offload target(mic)
       {
              foo();
       }
    }
    

    The following code is legal:

    void foo() {
    #pragma offload target(mic)
      {
        _Cilk_for (int i = 0; i < 10; i++) {  // Can use directly within an offload region
          // ...
        }
      }
    }
    