Intel® Cilk™ Plus

Parallel implementation of preconditioned conjugate gradient and Cholesky decomposition using Cilk Plus

Hi All,

I am looking for the source code of a parallel preconditioned conjugate gradient solver and a parallel Cholesky decomposition written with Cilk. If anybody has source code for these implementations, kindly share it. I need it urgently.


Abdul Jabbar. 



I understand cilk_sort is parallelized across shared-memory multicores?

Which sorting algorithm does it use: quicksort, insertion sort, heapsort, etc.?

Thank you.

Interesting reduce (add) case with array notation: is it possible?

I have the following code fragment, which already uses Cilk Plus array notation constructs:

out[0:Length] +=
(p_in_gains[0]*p_ins[0].blks[0][0:Length] +
p_in_gains[1]*p_ins[1].blks[0][0:Length] +
p_in_gains[2]*p_ins[2].blks[0][0:Length] +
p_in_gains[3]*p_ins[3].blks[0][0:Length] +
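
For readers less familiar with array notation, the visible part of the expression above reads as an elementwise gain-weighted accumulation. The following is only a sketch in plain C++ of that reading: the `Input` struct and `mix_into` function are hypothetical names, and the real layout of `p_ins[i].blks[0]` is not shown in the post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one input: a gain and one block of samples.
// The poster's actual structure (p_ins[i].blks[0]) is not shown.
struct Input {
    float gain;
    std::vector<float> block;
};

// Scalar reading of
//   out[0:Length] += gain0*in0[0:Length] + gain1*in1[0:Length] + ...
// Each output element accumulates the gain-weighted sum of all inputs.
void mix_into(std::vector<float>& out, const std::vector<Input>& inputs) {
    for (const Input& in : inputs) {
        assert(in.block.size() >= out.size());  // every block must cover out
    }
    for (std::size_t k = 0; k < out.size(); ++k) {
        float acc = 0.0f;
        for (const Input& in : inputs) {
            acc += in.gain * in.block[k];
        }
        out[k] += acc;
    }
}
```

Written this way, the sum over inputs is an explicit inner loop, which is the part a reduce (add) over the input index would have to replace.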

New with Cilk .... need a little help


I have an application that, according to Intel Advisor, should get a speedup of about 3.5 using 4 threads.

Using OpenMP I was able to get only ~2.35 with 2 threads and 2.85 with 4 threads.

Now I am learning and applying Cilk to see if I can improve this performance. My compiler is icc (ICC) 14.0.0 20130728.

I am using only cilk_for (similar to what I did with OpenMP). Running the application on one processor, I get the same performance with Cilk and OpenMP. However, adding more processors hurts the Cilk performance badly.
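
To put the quoted speedups in perspective, here is a small sketch of the arithmetic. It assumes Amdahl's law as a simple model, which the post itself does not mention, and the function names are made up for illustration.

```cpp
#include <cassert>

// Parallel efficiency: achieved speedup divided by the number of threads.
double parallel_efficiency(double speedup, int threads) {
    assert(threads > 0);
    return speedup / threads;
}

// Parallel fraction p implied by Amdahl's law, S = 1 / ((1 - p) + p / N),
// solved for p: p = (1 - 1/S) / (1 - 1/N).
// Assumption: Amdahl's law is only a rough model of the real application.
double amdahl_parallel_fraction(double speedup, int threads) {
    assert(threads > 1 && speedup > 0.0);
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads);
}
```

Under that model, `parallel_efficiency(2.85, 4)` comes to about 0.71, and a predicted speedup of 3.5 on 4 threads corresponds to `amdahl_parallel_fraction(3.5, 4)` of roughly 0.95, that is, about 95% of the run time being parallelisable.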

Variable-length arrays and Cilk Plus

It seems that I can use OpenMP together with Cilk Plus array notation on variable-length arrays, but not _Cilk_for.  This is under the Intel compiler with build  I get messages like:

junk.cpp(10): error: a variable captured by a lambda cannot have a type involving a variable-length array
              a[j][i] = (a[j][i] - __sec_reduce_add(a[0:j][i]*a[0:j][j])) / x;
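
As a point of comparison, the statement in the error message can be written without array notation or variable-length array types. This is a sketch only: it assumes a square matrix stored row-major in a flat `std::vector`, and the names `column_dot` and `update_entry` are hypothetical. The poster's actual storage and the meaning of `x` are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar equivalent of __sec_reduce_add(a[0:j][i] * a[0:j][j]):
// the sum over rows k = 0 .. j-1 of a[k][i] * a[k][j].
// Assumption: a is an n x n matrix stored row-major in a flat vector.
double column_dot(const std::vector<double>& a, std::size_t n,
                  std::size_t i, std::size_t j) {
    assert(a.size() == n * n && i < n && j < n);
    double sum = 0.0;
    for (std::size_t k = 0; k < j; ++k) {
        sum += a[k * n + i] * a[k * n + j];
    }
    return sum;
}

// The update from the error message, a[j][i] = (a[j][i] - sum) / x,
// where x is whatever divisor the surrounding code supplies.
void update_entry(std::vector<double>& a, std::size_t n,
                  std::size_t i, std::size_t j, double x) {
    a[j * n + i] = (a[j * n + i] - column_dot(a, n, i, j)) / x;
}
```

A `std::vector` has an ordinary, fixed type, so a lambda can capture it by reference without running into the restriction on variable-length array types. Whether that change fits the original code is a separate question.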

Array notation and parallelisation

Does the Intel compiler currently attempt to parallelise array notation expressions?  If it does, I am failing dismally in persuading it to do so.  I use CILK_NWORKERS=4, and print both the wall-clock and CPU times.

If not, what would the recommended alternative be in cases where parallelisation is desirable?  To back off to loops and use OpenMP?
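
The wall-clock versus CPU comparison mentioned above can be sketched in standard C++ as follows. The `Timing` struct and `time_work` helper are hypothetical names, not anything from the post, and the note about `std::clock` describes POSIX-style systems; some platforms report elapsed time from it instead.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

struct Timing {
    double wall_seconds;  // elapsed real time
    double cpu_seconds;   // processor time charged to the process
};

// Runs work() once and reports both times. On POSIX-style systems
// std::clock() counts CPU time summed over all threads of the process,
// so cpu_seconds well above wall_seconds suggests several workers ran.
template <typename Work>
Timing time_work(Work&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    assert(cpu_start != static_cast<std::clock_t>(-1));  // clock available

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

If the two numbers stay roughly equal with CILK_NWORKERS=4, the timed region most likely ran on a single worker.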

Cilkview bug


I'm getting some suspicious cilkview output (pasted below).  Notice that the work calculated in the Cilk parallel region is implausibly large.

I'm using:

Cilkview version 2.0.0, build 3566, built Aug 20 2013 13:58:06
Using PIN 2.12, Build 59761

gcc (GCC) 4.8.0 20120618 (experimental)

gcc -o qsort.64 -O3 -g -Werror -gdwarf-3 -std=gnu99 -m64 -I/afs/ qsort.c -lcilkrts -ldl
