An official example of user-defined reducer which gives a weird result

I'm a newcomer to Cilk. I tried an example from the official documentation at http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/2011Update/cpp/lin/index.htm#cref_cls/common/cilk_bk_using_cilk.htm -> Reducers -> Advanced Topic: How to Write a New Reducer. It describes a way to create "thread local storage" with a reducer by exploiting one of its properties: once a steal occurs, a new view is created, so this "view" can serve as a thread-local variable.

The classes "point" and "point_holder" are defined in the file "holder.h", and main is in "reverse.cpp". This little program reverses an array of point elements, using a temporary variable "temp", defined as a reducer, while swapping values.

I'm using icc version 13.1.0 to compile my program. 

Here is the problem. When I run the program, even with only one worker (OS thread) and a very small array, temp's valid_ for the first array element is still false on leaving the cilk_for loop. As a result, x() and y() both return -1, and the last array element ends up as the point (-1, -1). However, valid_ becomes "true" right afterwards. I think this means the compiler performs some optimization so that "valid_=true" in the set() method is not executed right after the assignments to x_ and y_, as it should be under serial semantics. Alternatively, the Cilk runtime system somehow affects the behavior of the reducer even with only one strand (no extra worker to steal).

I'm very much looking forward to hearing comments on this issue. You can try my code, which can be found in the attachments.

Attachments:
reverse.cpp (992 bytes)
holder.h (1.06 KB)
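
The attached files are not reproduced in this thread, so the following is only a sketch of what the serial behavior described in the question would look like. The names point, point_holder, set(), x(), y(), valid_ and temp come from the post; is_valid(), reverse_points() and the member view_ are hypothetical. It also assumes a plain member in place of the cilk::reducer view, so that it compiles without Cilk Plus, and an ordinary for loop in place of cilk_for.

```cpp
#include <cstddef>
#include <vector>

// A point whose coordinates only count once set() has run.
class point {
public:
    point() : x_(0), y_(0), valid_(false) {}
    // Serial semantics: valid_ is set right after x_ and y_.
    void set(int x, int y) { x_ = x; y_ = y; valid_ = true; }
    bool is_valid() const { return valid_; }
    // As described in the post, an invalid point reports -1.
    int x() const { return valid_ ? x_ : -1; }
    int y() const { return valid_ ? y_ : -1; }
private:
    int x_, y_;
    bool valid_;
};

// Assumption: stand-in for the reducer-based holder from the documentation.
// In the Cilk version, view_ would be the current strand's reducer view.
class point_holder {
public:
    void set(int x, int y) { view_.set(x, y); }
    bool is_valid() const { return view_.is_valid(); }
    int x() const { return view_.x(); }
    int y() const { return view_.y(); }
private:
    point view_;
};

// Hypothetical helper: reverse the array by swapping through temp.
std::vector<point> reverse_points(std::vector<point> a) {
    const std::size_t n = a.size();
    point_holder temp;
    for (std::size_t i = 0; i < n / 2; ++i) {  // cilk_for in the original
        temp.set(a[i].x(), a[i].y());
        a[i].set(a[n - 1 - i].x(), a[n - 1 - i].y());
        a[n - 1 - i].set(temp.x(), temp.y());
    }
    return a;
}
```

Under these assumptions, temp is valid as soon as set() returns, so the last element receives the first element's coordinates rather than (-1, -1). That is the result the serial semantics would lead one to expect with a single worker.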

Hello,

This seems to be a regression relative to Intel(R) Composer XE 2011 Update 13 (12.1): all 13.x versions up to the recent version 13.1.1 are affected. It only occurs at higher optimization levels (-O2 and -O3).
I've filed a defect ticket and will let you know as soon as it is fixed.
A workaround would be to use Intel(R) Composer XE 2011 Update 13 for the time being.

Best regards,

Georg Zitzlsberger

FWIW, is your code written correctly for what you intend?

As written, c is undefined outside the cilk_for (it is declared but not initialized), and as executed it becomes the last value written by the last thread in its last iteration (if any). Note that this is not necessarily the midpoint cell in the array when cilk_for is in effect.

Jim Dempsey

www.quickthreadprogramming.com
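
The thread does not show how c is used in reverse.cpp, so the following is only a hypothetical illustration of the point above: a scratch value that is declared and initialized inside the loop body belongs to one iteration and is never read after the loop. The function name reverse_with_local_scratch and the use of int elements are assumptions, and an ordinary for loop stands in for cilk_for so that the sketch compiles without Cilk Plus.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in: c is a per-iteration scratch value here, which
// may differ from its role in the original reverse.cpp.
std::vector<int> reverse_with_local_scratch(std::vector<int> a) {
    const std::size_t n = a.size();
    for (std::size_t i = 0; i < n / 2; ++i) {  // cilk_for in a Cilk version
        const int c = a[i];        // initialized at its declaration
        a[i] = a[n - 1 - i];
        a[n - 1 - i] = c;          // c is not visible outside this iteration
    }
    return a;
}
```

With this shape there is no value of c left over after the loop, so the question of which iteration wrote it last does not arise.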

Hello,

It has been fixed for the future Intel(R) Composer XE 2013 SP1 (14.x), to be released at the end of this year. Unfortunately, engineering cannot fix it for the current 13.x releases.

Best regards,

Georg Zitzlsberger
