Cilkplus port to Raspberry Pi 2B

I compiled the new release of gcc-5.1 with the Cilkplus parallel processing extensions and runtime library for the ARMv7 architecture on the Raspberry Pi 2B single-board computer.  Two changes were needed.

The first change corrects a typo in generic/cilk-abi-vla.c by changing the second-to-last line of the file from

vla_internal_heap_free(t, full_size);

to

vla_internal_heap_free(p, full_size);

The second change, to generic/os-fence.c, is ARM-specific. Comment out the line

COMMON_SYSDEP void __cilkrts_fence(void); ///< MFENCE instruction
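The comment on that declaration names MFENCE, which is an x86 instruction and has no direct ARMv7 counterpart. As a hedged illustration only (the post does not say what, if anything, replaces the declaration), a portable full fence can be written in C++11 with std::atomic_thread_fence. The names full_memory_fence and publish_and_read below are hypothetical and are not part of the Cilk Plus runtime.

```cpp
#include <atomic>

// Hypothetical stand-in for a full memory fence. This is NOT the Cilk Plus
// runtime's ARM implementation; it only illustrates a portable alternative
// to an x86-specific MFENCE.
inline void full_memory_fence() {
    // Sequentially consistent fence: earlier loads and stores are ordered
    // before later ones. The compiler selects the barrier instruction for
    // the target architecture.
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Publish a payload, fence, then raise a flag. A reader that observes the
// flag and issues its own fence would then see the payload.
// Returns the payload if the flag is set, otherwise -1.
inline int publish_and_read(int value) {
    static int payload = 0;
    static std::atomic<bool> ready{false};

    payload = value;                               // plain store
    full_memory_fence();                           // order payload before flag
    ready.store(true, std::memory_order_relaxed);  // publish

    return ready.load(std::memory_order_relaxed) ? payload : -1;
}
```

Calling publish_and_read(42) from a single thread simply returns 42; the fence only matters once a second thread reads the flag and the payload.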

OpenMP spinning time


I am using a simple Merge Sort benchmark on the Xeon Phi. 78% of the total CPU time is consumed by ""

I tried to reduce the time wasted by the OpenMP runtime library by setting "export KMP_BLOCKTIME=0". Please note that the application is running natively on the MIC. I have also tried "export OMP_WAIT_POLICY=passive". No effect!

Why does this not have any effect on the execution time or the wasted CPU time?
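One thing that could be ruled out, offered as an assumption about the setup rather than a diagnosis, is whether the exported variables are actually visible to the process running on the coprocessor. A minimal sketch that reports what the process sees; env_or_unset and describe_wait_settings are hypothetical helper names:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: value of an environment variable as seen by this
// process, or "<unset>" if the variable is not defined.
std::string env_or_unset(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("<unset>");
}

// Summarize the two settings mentioned in the post so they can be logged
// at program start, before any parallel region runs.
std::string describe_wait_settings() {
    return "KMP_BLOCKTIME=" + env_or_unset("KMP_BLOCKTIME") +
           " OMP_WAIT_POLICY=" + env_or_unset("OMP_WAIT_POLICY");
}
```

Printing the result of describe_wait_settings() at the top of the benchmark shows whether the shell that launched the native binary really passed both settings through.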

Thank you.

link error: libmkl_core.a depends on Open MPI (via libmkl_blacs_openmpi_lp64.a)

While invoking the Intel icpc compiler, version 2015.2.164, I encountered a link error when linking against libmkl_core.a:

/opt/intel/composer_xe_2015.2.164/mkl/lib/intel64/libmkl_core.a(cpardiso_blacs_lp64.o): In function 'mkl_pds_lp64_cpardiso_mpi_barrier':

__work/lnx32e/_cpardiso/kernel/mpi_wrapper/cpardiso_blacs_lp64_h.f:(.text+0x6): undefined reference to 'MKL_Barrier'

No SITE annotations were encountered?

Dear All,

I have just installed Parallel Studio 2016 and am trying to use the memory access pattern analysis in Intel Advisor. I have added annotations to the source file like this:

for (int i=0; i < nt.n_presyn; ++i) {


When I run the analysis with Advisor, I get:

Calling flow graph operator()() through a shared library

Does anyone know if this is possible?...

I would like to create a FlowGraph in which each function is a class loaded from a shared library, probably using Boost::extension to maintain portability between Windows and Linux.

Just a little concerned about:

- calls to the operator() to run the flow graph function

- any performance penalties this may incur

Any ideas?
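A minimal sketch of one way this could be arranged, assuming a small abstract interface shared by the host and the plugins. NodeLogic, PluginBody, Doubler and make_doubler are hypothetical names, and the locally defined Doubler stands in for a class that would really be created by a factory function exported from the shared library:

```cpp
#include <memory>
#include <utility>

// Hypothetical plugin interface; in a real build it would live in a header
// shared by the host application and every shared library.
struct NodeLogic {
    virtual ~NodeLogic() = default;
    virtual int run(int input) const = 0;
};

// Copyable function object that forwards operator() to the plugin object.
// Copies share the same plugin instance through the shared_ptr.
class PluginBody {
public:
    explicit PluginBody(std::shared_ptr<const NodeLogic> logic)
        : logic_(std::move(logic)) {}

    int operator()(int input) const { return logic_->run(input); }

private:
    std::shared_ptr<const NodeLogic> logic_;
};

// Stand-in for a class implemented inside a shared library.
struct Doubler : NodeLogic {
    int run(int input) const override { return 2 * input; }
};

// Stand-in for the factory function a shared library would export.
std::shared_ptr<const NodeLogic> make_doubler() {
    return std::make_shared<Doubler>();
}
```

With TBB available, an object such as PluginBody(make_doubler()) could be passed wherever a node body is expected, for example to a tbb::flow::function_node<int, int>. In this arrangement the extra cost per invocation is one virtual call through the interface, which is usually small next to the work a node does, though that would need measuring for the actual workload.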



Visual Studio 2013 crashes when attempting to build project

I'm using Visual Studio 2013 Update 4 on Windows 7 Pro with Intel VTune 2015 Update 2.  I've been using VTune and Visual Studio just fine for the past few days, but I've recently hit a massive roadblock.  Visual Studio will crash with a null reference exception whenever I try to build the working solution.

Here's the call stack error:

ippsDotProd_32f Performance on Haswell CPU


At the moment I'm using ippsDotProd_32f in IPP 7.0 quite extensively in one of my projects. I have now tested IPP 8.2 on a Haswell CPU (Xeon E5-2650 v3 in an HP Z640 workstation) with this project because I expected it to be significantly faster (see below). Actually, the code was about 10% slower using IPP 8.2, which I found quite disturbing.
