Optimize embarrassingly parallel loops

Introduction

Algorithms that combine data parallelism with independent iterations map naturally onto ‘embarrassingly parallel’ loops. This article looks at techniques that maximize the performance of such loops with minimal effort. One example is a loop that calculates the logarithm of each element of an array.

Auto-vectorization

Intel® Composer can automatically detect loops that lend themselves to auto-vectorization. These include explicit for loops over static or dynamic arrays, and over vector and valarray containers. Implicit valarray loops can either be auto-vectorized or directed to invoke optimized Intel® Performance Primitives (Intel® IPP) library primitives. See the section "Use Intel optimized valarray header" for how to enable the Intel optimized valarray headers.

The following example includes explicit vector and valarray loops and an implicit valarray loop.

 

valarray<float> vf(size), vfr(size);
vector<float> vecf(size), vecfr(size);

//log function, vector, explicit loop over all size elements
for (int j = 0; j < size; j++) {
	vecfr[j]=log(vecf[j]);
}

//log function, valarray, explicit loop over all size elements
for (int j = 0; j < size; j++) {
	vfr[j]=log(vf[j]);
}

//log function, valarray, implicit loop
vfr=log(vf);
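
The same pattern applies to the static and dynamic arrays mentioned earlier. The following is a minimal sketch rather than code from the attached sample: the names kSize, log_static and log_dynamic are hypothetical and chosen only for illustration. Each iteration reads one input element and writes one output element, which is the shape of loop the text describes as a candidate for auto-vectorization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical array length, used only for this illustration.
constexpr std::size_t kSize = 1024;

// Static (fixed-size) arrays: the trip count is known at compile time.
void log_static(const float (&in)[kSize], float (&out)[kSize]) {
    for (std::size_t j = 0; j < kSize; ++j) {
        out[j] = std::log(in[j]);
    }
}

// Dynamic arrays: the caller passes pointers plus the element count.
void log_dynamic(const float* in, float* out, std::size_t n) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t j = 0; j < n; ++j) {
        out[j] = std::log(in[j]);
    }
}
```

Both functions produce the same results for the same input; whether a given compiler actually vectorizes them depends on its options and on what it can prove about the pointers involved.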
	

 

Use Intel optimized valarray header

  • Add the following option under Command Line > Additional Options: /Quse-intel-optimized-headers.

use-intel-optimized-headers.png

  • Make sure that the Intel IPP libraries are selected in the ‘Build Component Selection’ dialog.

BuildComponentSelection.png 

 

 SelectCommonIPP.png

 

Limitations

Algorithms whose iterations are not independent require fine-grained parallelism instead. See this link for additional information.
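
To illustrate what iteration independence means here, the sketch below contrasts two loops. The function names scale_all and running_sum are hypothetical and not part of the attached sample. In the first loop every iteration stands alone; in the second, each iteration consumes the result of the previous one, so the loop as written does not fit the embarrassingly parallel pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: out[j] depends only on in[j], so the
// iterations can be executed in any order.
void scale_all(const std::vector<float>& in, std::vector<float>& out, float k) {
    assert(in.size() == out.size());
    for (std::size_t j = 0; j < in.size(); ++j) {
        out[j] = k * in[j];
    }
}

// Loop-carried dependence: out[j] reads out[j - 1], so in this form
// iteration j needs the result of iteration j - 1.
void running_sum(const std::vector<float>& in, std::vector<float>& out) {
    assert(in.size() == out.size());
    if (in.empty()) {
        return;
    }
    out[0] = in[0];
    for (std::size_t j = 1; j < in.size(); ++j) {
        out[j] = out[j - 1] + in[j];
    }
}
```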

Currently, valarray calculations that create implicit temporary arrays incur a severe performance penalty, even with the Intel IPP interface. An example is an expression such as y = (b^2 - 4*a*c) / 2*a, where a, b, c and y are valarray objects. In such cases, avoid valarray.
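
One possible alternative, offered here as an assumption rather than as the article's prescribed fix, is to write the expression as a single explicit loop so that no intermediate arrays are needed. The sketch below evaluates the expression exactly as written above under C++ operator precedence, that is ((b*b - 4*a*c) / 2) * a. The function names eval_valarray and eval_fused are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Whole-array form: each sub-expression (b*b, 4*a*c, ...) may be
// evaluated into a temporary array, depending on the implementation.
std::valarray<float> eval_valarray(const std::valarray<float>& a,
                                   const std::valarray<float>& b,
                                   const std::valarray<float>& c) {
    std::valarray<float> y = (b * b - 4.0f * a * c) / 2.0f * a;
    return y;
}

// Fused form: one pass over the data, one result element per iteration,
// with no intermediate arrays.
std::valarray<float> eval_fused(const std::valarray<float>& a,
                                const std::valarray<float>& b,
                                const std::valarray<float>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    std::valarray<float> y(a.size());
    for (std::size_t j = 0; j < a.size(); ++j) {
        y[j] = (b[j] * b[j] - 4.0f * a[j] * c[j]) / 2.0f * a[j];
    }
    return y;
}
```

The fused loop has independent iterations, so it has the same shape as the explicit log loops shown earlier.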

Attachment: v1.cpp (1.71 KB)
For more complete information about compiler optimizations, see our Optimization Notice.

Comments

/Quse-intel-optimized-headers doesn't work under Visual Studio 2005 x64 on Windows Server 2003 x64, because the linker cannot find the 32-bit libraries. I searched for them without success; only 64-bit libraries are available in the IPP folder.
Thanks in advance.

Jennifer J. (Intel)

Did you install the Intel Parallel Composer's intel-64 package only?