Combining pragma offload and Cilk shared?

My goal is to create a class that can handle data transfers to and from the MIC, and to be able to offload calls that apply a functor to the data.

I can do this successfully using the #pragma offload commands so long as the functor to be applied is actually just a function:

#include <stdio.h>
#include <iostream>
#define SIZE 8
template<typename T>
class Functor
{
public:
  T operator()(T input)
  {
    #ifdef __MIC__
      printf("Running on MIC\n");
    #else
      printf("No MIC\n");
    #endif
    return (4*input);
  }
};
class Data
{
public:
  Data()
  {
    rawData = new int[SIZE];
    for (unsigned int i=0; i<SIZE; i++) rawData[i] = 1;
    int* localData = rawData;
    #pragma offload_transfer target(mic:0) in(localData : length(SIZE) alloc_if(1) free_if(0))
  };
  void output()
  {
    int* localData = rawData;
    #pragma offload_transfer target(mic:0) out(localData : length(SIZE) alloc_if(0) free_if(0))
    for (unsigned int i=0; i<SIZE; i++) std::cout << localData[i] << " ";
    std::cout << std::endl;
  }
  int* rawData;
};
template<typename UnaryFunction>
void functorCall(int* fdata, UnaryFunction uf)
{
  int* localData = fdata;
  #pragma offload target(mic:0) in(localData : length(0) alloc_if(0) free_if(0))
  {
    for (unsigned int i=0; i<SIZE; i++)
      localData[i] = uf(localData[i]);
  }
}
int main(int argc, char* argv[])
{
  Data* data = new Data();
  functorCall(data->rawData, Functor<int>());
  data->output();
  return 0;
}

[cs@cn152 offload]$ icc -O2 -offload-attribute-target=mic -o test1 test1.cpp
[cs@cn152 offload]$ ./test1
4 4 4 4 4 4 4 4
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC

However, if the functor includes data or a constructor, it is no longer bitwise copyable, so the same approach fails to compile:

#include <stdio.h>
#include <iostream>
#define SIZE 8
template<typename T>
class Functor
{
public:
  Functor() {}
  T operator()(T input)
  {
    #ifdef __MIC__
      printf("Running on MIC\n");
    #else
      printf("No MIC\n");
    #endif
    return (4*input);
  }
};
class Data
{
public:
  Data()
  {
    rawData = new int[SIZE];
    for (unsigned int i=0; i<SIZE; i++) rawData[i] = 1;
    int* localData = rawData;
    #pragma offload_transfer target(mic:0) in(localData : length(SIZE) alloc_if(1) free_if(0))
  };
  void output()
  {
    int* localData = rawData;
    #pragma offload_transfer target(mic:0) out(localData : length(SIZE) alloc_if(0) free_if(0))
    for (unsigned int i=0; i<SIZE; i++) std::cout << localData[i] << " ";
    std::cout << std::endl;
  }
  int* rawData;
};
template<typename UnaryFunction>
void functorCall(int* fdata, UnaryFunction uf)
{
  int* localData = fdata;
  #pragma offload target(mic:0) in(localData : length(0) alloc_if(0) free_if(0))
  {
    for (unsigned int i=0; i<SIZE; i++)
      localData[i] = uf(localData[i]);
  }
}
int main(int argc, char* argv[])
{
  Data* data = new Data();
  functorCall(data->rawData, Functor<int>());
  data->output();
  return 0;
}

[cs@cn152 offload]$ icc -O2 -offload-attribute-target=mic -o test2 test2.cpp
test2.cpp(42): error: variable "uf" used in this offload region is not bitwise copyable
    #pragma offload target(mic:0) in(localData : length(0) alloc_if(0) free_if(0))
    ^
          detected during instantiation of "void functorCall(int *, UnaryFunction) [with UnaryFunction=Functor<int>]" at line 63

compilation aborted for test2.cpp (code 2)
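
The diagnostic above hinges on whether the functor type can be copied byte for byte. The thread does not state icc's exact rule, so the following is only a sketch under the assumption that the rule is close to the standard C++11 notion of a trivial type. The names PlainFunctor, CtorFunctor, and looksBitwiseCopyable are hypothetical stand-ins, not part of any Intel API. Under the standard traits, an empty user-provided constructor leaves the type trivially copyable but makes it non-trivial, which is consistent with test1 compiling and test2 being rejected:

```cpp
#include <type_traits>

// Hypothetical stand-in for the first Functor: no user-declared constructor.
template <typename T>
struct PlainFunctor {
  T operator()(T input) const { return 4 * input; }
};

// Hypothetical stand-in for the second Functor: user-provided constructor.
template <typename T>
struct CtorFunctor {
  CtorFunctor() {}
  T operator()(T input) const { return 4 * input; }
};

// Assumption: icc's "bitwise copyable" check is approximated by std::is_trivial.
// This is a guess for illustration, not the compiler's documented rule.
template <typename F>
constexpr bool looksBitwiseCopyable() {
  return std::is_trivial<F>::value;
}
```

A static_assert on looksBitwiseCopyable<UnaryFunction>() inside functorCall would flag such types with a clearer message, though whether it matches icc's check in every case is untested.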

So, I tried using the implicit memory copy model with the Cilk directives.  This allowed me to use the full-featured functor, so long as my data was global:

#include <stdio.h>
#include <iostream>
#define SIZE 8
int* _Cilk_shared globalData;
template<typename T>
class Functor
{
public:
  Functor() {}
  _Cilk_shared T operator()(T input)
  {
    #ifdef __MIC__
      printf("Running on MIC\n");
    #else
      printf("No MIC\n");
    #endif
    return (4*input);
  }
};
template<typename UnaryFunction>
_Cilk_shared void functorCall(UnaryFunction uf)
{
  for (unsigned int i=0; i<SIZE; i++)
    globalData[i] = uf(globalData[i]);
}
int main(int argc, char* argv[])
{
  globalData = (_Cilk_shared int*)_Offload_shared_malloc(SIZE*sizeof(int));
  for (unsigned int i=0; i<SIZE; i++) globalData[i] = 1;
  _Cilk_offload functorCall(Functor<int>());
  for (unsigned int i=0; i<SIZE; i++) std::cout << globalData[i] << " ";
  std::cout << std::endl;
  return 0;
}

[cs@cn152 offload]$ icc -O2 -o test3 test3.cpp
[cs@cn152 offload]$ ./test3
4 4 4 4 4 4 4 4
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC

However, if I try to use non-global memory, such as a class member, I get lots of warnings:

#include <stdio.h>
#include <iostream>
#define SIZE 8
template<typename T>
class Functor
{
public:
  Functor() {}
  _Cilk_shared T operator()(T input)
  {
    #ifdef __MIC__
      printf("Running on MIC\n");
    #else
      printf("No MIC\n");
    #endif
    return (4*input);
  }
};
class _Cilk_shared Data
{
public:
  Data()
  {
    rawData = (_Cilk_shared int*)_Offload_shared_malloc(SIZE*sizeof(int));
    for (unsigned int i=0; i<SIZE; i++) rawData[i] = 1;
  };
  void output()
  {
    for (unsigned int i=0; i<SIZE; i++) std::cout << rawData[i] << " ";
    std::cout << std::endl;
  }
  int* _Cilk_shared rawData;
};
template<typename UnaryFunction>
_Cilk_shared void functorCall(int* fdata, UnaryFunction uf)
{
  for (unsigned int i=0; i<SIZE; i++)
    fdata[i] = uf(fdata[i]);
}
int main(int argc, char* argv[])
{
  Data* data = new Data();
  _Cilk_offload functorCall(data->rawData, Functor<int>());
  data->output();
  return 0;
}

[cs@cn152 offload]$ icc -O2 -o test4 test4.cpp
test4.cpp(32): warning #2696: _Cilk_shared may not be applied to non-static fields of a class/struct
    int* _Cilk_shared rawData;
                      ^
test4.cpp(43): warning #2571: variable has not been declared with compatible "_Cilk_shared" attribute
    _Cilk_offload functorCall(data->rawData, Functor<int>());
                              ^
test4.cpp(43): warning #2707: pointer argument in _Cilk_offload function call is not pointer-to-shared
    _Cilk_offload functorCall(data->rawData, Functor<int>());
                              ^
test4.cpp(32): warning #2696: *MIC* _Cilk_shared may not be applied to non-static fields of a class/struct
    int* _Cilk_shared rawData;
                      ^
test4.cpp(43): warning #2571: *MIC* variable has not been declared with compatible "_Cilk_shared" attribute
    _Cilk_offload functorCall(data->rawData, Functor<int>());
                              ^
test4.cpp(43): warning #2707: *MIC* pointer argument in _Cilk_offload function call is not pointer-to-shared
    _Cilk_offload functorCall(data->rawData, Functor<int>());

Interestingly, though, despite the warnings, it actually still seems to work correctly, at least in this case:

[cs@cn152 offload]$ ./test4
4 4 4 4 4 4 4 4
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC
Running on MIC

I know the documentation says that _Cilk_shared may not be applied to non-static fields of a class/struct, so I understand the warnings in that sense.  However, is there any way to do what I am trying to do?  In other words, is there a way to (a) handle functors with the #pragma offload directives, (b) handle data pointers that are class members with the implicit _Cilk_shared model, (c) combine a and b in some way, or (d) some other way to do this?

Or should I just ignore these warnings, since it seems to work in this case?  (Or maybe it is not really working and only appears to, based on these simple diagnostics.)  In more complex code it does not seem to work, but perhaps I have a different problem in that case.

Thanks!

Any ideas or pointers to more detailed documentation on offloading?

I have been reading http://software.intel.com/sites/products/documentation/doclib/iss/2013/c..., but haven't yet found a way to do what I am trying to do.

Thanks!

Looking at your last example, if your intent is to keep the object "data" in non-shared memory but allow the rawData pointer in it to point to shared memory, then some changes are needed in your program:

1. Change the declaration of the rawData member on line 35 to "_Cilk_shared int * rawData;" because that is how you declare a "pointer to shared int".

2. Change the declaration of functorCall on line 39 to "_Cilk_shared void functorCall(_Cilk_shared int * fdata, UnaryFunction uf);" so that the type of the first parameter matches the type of the argument used to call the function.

With these changes, the compiler still issues a warning for the _Cilk_offload call on line 48. This warning is spurious, and we will fix the compiler to accept that offload call as written. For now, to avoid the warning, copy data->rawData into a local variable and pass that variable to the function. The warning does not affect runtime behavior, and the program runs correctly.

  // Workaround to avoid compiler warning
  _Cilk_shared int* rawData = data->rawData;
  _Cilk_offload functorCall(rawData, Functor<int>());
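
The difference between "int* _Cilk_shared rawData" and "_Cilk_shared int* rawData" is where the qualifier binds. As a rough analogy only (an assumption for illustration; _Cilk_shared is an Intel extension and behaves differently from const in other respects), const placement in standard C++ follows the same pattern: a qualifier before the pointee type describes what is pointed to, while one after the * describes the pointer variable itself. The aliases and helper names below are hypothetical:

```cpp
#include <type_traits>

// Analogous to "_Cilk_shared int* p": the pointee carries the qualifier.
using PointerToQualifiedInt = const int*;

// Analogous to "int* _Cilk_shared p": the pointer variable itself carries it.
using QualifiedPointerToInt = int* const;

// True when the qualifier applies to the pointed-to type.
template <typename P>
constexpr bool pointeeIsQualified() {
  return std::is_const<typename std::remove_pointer<P>::type>::value;
}

// True when the qualifier applies to the pointer object itself.
template <typename P>
constexpr bool pointerIsQualified() {
  return std::is_const<P>::value;
}
```

Read this way, warning #2696 concerns the second form, where the qualifier lands on the class field itself, while the first form only changes the type the field points to.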

Thanks!