Vector semantics less efficient than map one...

Hi,

I am reporting a simple experiment because it took me some time to figure out how to get multi-threading (ARBB_OPT_LEVEL=O3) with a simple kernel that applies the exp function to all elements of a vector.

The following vector kernel :

template <typename T>
void exp_kernel(arbb::dense<T>& X)
{
    X = exp(X);
}

does not provide any multi-threading speedup (with ARBB_OPT_LEVEL=O3), while the following one does:

template <typename T>
void elementary_exp_kernel(T& Xi)
{
    Xi = exp(Xi);
}

template <typename T>
void map_exp_kernel(arbb::dense<T>& X)
{
    // X = exp(X);
    arbb::map(elementary_exp_kernel<T>)(X);
}

We did some investigation at ARBB_OPT_LEVEL=O3 and found that the runtime apparently does not spawn multiple tasks for the vector kernel (the exp_kernel routine in your example). But it does spawn multiple tasks for the map kernel (the map_exp_kernel routine in your example). This explains the difference in performance.

We think this is a bug. The two kernels should have identical performance. We are working on it and will get it fixed in a future release.
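To illustrate why the two formulations are expected to behave the same, here is a rough analogy in plain standard C++. It is not ArBB code and says nothing about how the ArBB runtime schedules tasks. The names vector_exp, map_exp and elementary_exp are hypothetical stand-ins for the kernels in this thread, and std::vector stands in for arbb::dense. Both forms perform exactly the same element-wise work, so a runtime is in principle free to split either one across tasks.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Stand-in for elementary_exp_kernel: operates on a single element.
template <typename T>
void elementary_exp(T& xi) {
    xi = std::exp(xi);
}

// Stand-in for the "vector" form (X = exp(X)): one whole-container operation.
template <typename T>
void vector_exp(std::vector<T>& x) {
    std::transform(x.begin(), x.end(), x.begin(),
                   [](T v) { return std::exp(v); });
}

// Stand-in for the "map" form: apply the element function to every element.
template <typename T>
void map_exp(std::vector<T>& x) {
    std::for_each(x.begin(), x.end(), elementary_exp<T>);
}
```

Calling vector_exp and map_exp on two copies of the same input should leave both copies holding the same values, which is the sense in which the two ArBB kernels are equivalent and only differ in how the runtime currently parallelizes them.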

Thanks for reporting the problem. We appreciate your interest in Intel ArBB. Please keep it going.

Zhang
