Intel® Parallel Composer

cilkview and gcc 4.9 cilkplus branch on 64 bit linux

Hello, I've downloaded the latest (I think) version of cilkview from the cilkplus website. For some reason it is not working (even if I try to run it from the same folder, it doesn't find the program...). I suspect it has to do with a 32-bit vs 64-bit library mismatch. I exported the lib64, to no avail. Is there any kind of support for cilkview? If not, will you guys just release the source, so we can patch it? This would be great. Same with cilkprof.

Thanks for your time.

Is it possible for me to improve a simple O(n) algorithm with Cilk Plus?

Hi everybody!

I have a simple algorithm to print the very first unique ASCII character from a stream. In the worst case the algorithm scans through the stream twice, doing some comparisons and increments. So, roughly, my algorithm is O(n).
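For readers following along, here is a minimal sketch of the kind of two-pass scan described above. The poster's actual code is not shown, so the function name `first_unique_ascii` and the convention of returning `'\0'` when nothing is unique are assumptions made for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper (not the poster's code): returns the first character
// that occurs exactly once in `stream`, or '\0' if there is none.
char first_unique_ascii(const std::string& stream) {
    // One counter per possible byte value, zero-initialized.
    std::array<std::size_t, 256> counts{};

    // Pass 1: count how often each character occurs.
    for (unsigned char c : stream) {
        ++counts[c];
    }

    // Pass 2: the first character whose count is 1 is the answer.
    for (unsigned char c : stream) {
        if (counts[c] == 1) {
            return static_cast<char>(c);
        }
    }
    return '\0';
}
```

Both passes touch each character once, which matches the O(n) estimate. The counts from pass 1 are additive, so that pass could in principle be split into chunks whose per-chunk counts are summed afterwards; whether that pays off for such a cheap loop body would have to be measured.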

The algorithm does two main things:

vectorizing with an inline function?

I attached two code files mandel1.cpp and mandel2.cpp.

mandel1.cpp has a loop with all the code in the body

mandel2.cpp has equivalent code but instead of having the code in the body it calls an inline function

Compiling with the Intel C++ Compiler 15 using "icc -O3 -fp-model fast=2 -xCORE-AVX2 -fma -c -S", I can vectorize mandel1.cpp but not mandel2.cpp.

Is there a way I can vectorize mandel2.cpp and still have a separate function? It seems like the optimizer ought to be able to inline the function and then apply the vectorization, given that it can vectorize mandel1.cpp.
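The attachments are not reproduced here, so the following is only a hypothetical reconstruction of the mandel2.cpp shape: a loop that calls a separate inline function. It marks the function with `#pragma omp declare simd` and the loop with `#pragma omp simd`, both standard OpenMP 4.0 directives. The names `mandel_iters` and `mandel_row` are made up for this sketch, and whether a particular compiler version actually vectorizes it still has to be checked in its vectorization report.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-point kernel (not the poster's code). The directive asks
// the compiler to also generate a SIMD variant of this function.
#pragma omp declare simd
inline int mandel_iters(float cr, float ci, int max_iters) {
    float zr = 0.0f, zi = 0.0f;
    int n = 0;
    // Iterate z = z*z + c until |z|^2 exceeds 4 or the budget runs out.
    while (n < max_iters && zr * zr + zi * zi <= 4.0f) {
        const float t = zr * zr - zi * zi + cr;
        zi = 2.0f * zr * zi + ci;
        zr = t;
        ++n;
    }
    return n;
}

// Computes one image row; the loop body is just a call to the function above.
std::vector<int> mandel_row(float x0, float dx, float ci,
                            std::size_t width, int max_iters) {
    std::vector<int> out(width);
    #pragma omp simd
    for (std::size_t i = 0; i < width; ++i) {
        out[i] = mandel_iters(x0 + dx * static_cast<float>(i), ci, max_iters);
    }
    return out;
}
```

Without the compiler's OpenMP SIMD support enabled, the directives are ignored and the code still compiles and runs as ordinary scalar C++, so the function can stay separate either way.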

How to compile the Cilk Plus runtime source with Intel® C++ Composer XE 2013

Dear all,

I want to compile the Cilk Plus runtime source with Intel® C++ Composer XE 2013. I built the Cilk Plus runtime according to the directions in the "readme" file (libtoolize; aclocal; automake --add-missing; autoconf; ./configure; make; make install). But built this way, gcc is used by default.

Could somebody please give me some guidelines for compiling the Cilk Plus runtime source with Intel® C++ Composer XE 2013 instead?

Thanks a lot for your help.

Best Regards,

Yaqiong Peng

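Transferring non-contiguous array elements with the Intel® Language Extensions for Offload (LEO) on the Intel® Xeon Phi™ coprocessor

The Intel® Parallel Studio XE 2015 compiler release for C++ on Windows* and Linux* provides an enhancement that supports transferring non-contiguous array elements with the Intel® Language Extensions for Offload (LEO) on the Intel® Xeon Phi™ coprocessor.

Under the LEO offload data marshalling model, the feature adds support for transferring non-contiguous array elements in an array variable reference (variable-ref) using the data transfer clauses (such as in, out, inout, nocopy) of the #pragma offload/offload_transfer statements.

Under the offload data marshalling model, each data transfer clause (in, out, inout, nocopy) shares a common basic syntax, shown below. The enhancement allows a stride value to be specified in the c-shape specification, as follows.

Syntax: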
            #pragma offload clause [ clause …]

  • Developers
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • Server
  • C/C++
  • Intermediate
  • Intel® C++ Compiler
  • Intel® C++ Composer XE
  • Intel® Composer XE
  • Intel® Parallel Composer
  • Development Tools
  • Intel® Many Integrated Core Architecture

Thread local calculation of reducers?

Hi,

I wonder how reducers work internally. So if a value is set into a reducer, does it block other threads each time a value is set?

I ask because normally I'm creating a local 'reducer', e.g. a local histogram on an image tile and on leaving the thread all the data is pushed at once into the global reducer. Just like local memory operations in OpenCL.
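To make the manual pattern in this question concrete, here is a minimal sketch of a per-tile local histogram that is merged into a global one in a single step. It deliberately uses plain `std::thread` and `std::mutex` as stand-ins; it is not Cilk Plus code and says nothing about how Cilk Plus reducers are implemented. The name `tiled_histogram` and the 1-D tiling are assumptions for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

using Histogram = std::array<std::size_t, 256>;

// Hypothetical helper: splits `pixels` into `num_tiles` contiguous tiles,
// builds one private histogram per tile, and merges each into the result.
Histogram tiled_histogram(const std::vector<std::uint8_t>& pixels,
                          std::size_t num_tiles) {
    Histogram global{};
    if (num_tiles == 0) return global;

    std::mutex global_mutex;
    std::vector<std::thread> workers;
    const std::size_t tile = (pixels.size() + num_tiles - 1) / num_tiles;

    for (std::size_t t = 0; t < num_tiles; ++t) {
        const std::size_t begin = std::min(t * tile, pixels.size());
        const std::size_t end = std::min(begin + tile, pixels.size());
        workers.emplace_back([&pixels, &global, &global_mutex, begin, end] {
            // Thread-private accumulation: no locking per pixel.
            Histogram local{};
            for (std::size_t i = begin; i < end; ++i) ++local[pixels[i]];

            // Single merge on leaving the tile: one lock per thread.
            std::lock_guard<std::mutex> lock(global_mutex);
            for (std::size_t b = 0; b < local.size(); ++b) global[b] += local[b];
        });
    }
    for (auto& w : workers) w.join();
    return global;
}
```

The design point is that contention is limited to one short merge per tile rather than one synchronized update per pixel.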
