If you are new to Cilk™ Plus, you have probably been impressed by how easy it is to turn a serial program into a parallel program. You’ve also realized, though, that adding cilk_spawn or cilk_for to a program doesn’t automatically solve the harder parts of parallel programming: dealing with data races and coordinating work that is done in parallel.
CilkPlus
Improving graphics processing performance using Intel® Cilk™ Plus
Author(s): Anoop Madhusoodhanan Prabha, Mark Sabahi
Intel® Cilk™ Plus – AOBench Sample
This is the AOBench example associated with the "Intel® Cilk™ Plus – The Simplest Path to Parallelism" how-to article. It shows an Ambient Occlusion algorithm implemented four ways: one version using serial loops, one using the Intel Cilk Plus cilk_for keyword for parallelism, one using Intel Cilk Plus array notations to enable vectorization with SIMD instructions, and one using both cilk_for and array notations. It demonstrates that combining task parallelism and data parallelism can yield substantial performance gains with very few code changes.
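To make the difference between the variants concrete, here is a minimal sketch of the three styles applied to a toy per-pixel computation. It is not the AOBench source: the functions shade_row_serial, shade_row_vector, and shade_image and the formula out = a * b + c are hypothetical stand-ins chosen for illustration. The sketch assumes a compiler that defines the __cilk macro when Cilk Plus is enabled (for example the Intel compiler, or GCC 5 through 7 with -fcilkplus) and falls back to plain serial loops otherwise.

```cpp
#include <cstddef>
#include <vector>

#if defined(__cilk)
#include <cilk/cilk.h>      // provides the cilk_for keyword
#define HAVE_CILK_PLUS 1
#else
#define cilk_for for        // serial fallback so the sketch still compiles
#define HAVE_CILK_PLUS 0
#endif

// Serial version: an ordinary loop over one row of pixels.
// (Hypothetical stand-in for the real per-pixel work.)
void shade_row_serial(const float* a, const float* b, float c,
                      float* out, int n) {
    for (int i = 0; i < n; ++i) {
        out[i] = a[i] * b[i] + c;
    }
}

// Array-notation version: the section out[0:n] means "n elements
// starting at index 0", which lets the compiler vectorize the statement.
void shade_row_vector(const float* a, const float* b, float c,
                      float* out, int n) {
#if HAVE_CILK_PLUS
    out[0:n] = a[0:n] * b[0:n] + c;
#else
    shade_row_serial(a, b, c, out, n);  // fallback without Cilk Plus
#endif
}

// Combined version: cilk_for distributes rows across workers (task
// parallelism) while each row is processed with array notation (data
// parallelism). Rows write to disjoint parts of out, so there is no race.
void shade_image(const std::vector<float>& a, const std::vector<float>& b,
                 float c, std::vector<float>& out, int rows, int cols) {
    cilk_for (int r = 0; r < rows; ++r) {
        const std::size_t offset = static_cast<std::size_t>(r) * cols;
        shade_row_vector(&a[offset], &b[offset], c, &out[offset], cols);
    }
}
```

In this sketch the only changes relative to the serial code are the cilk_for keyword on the outer loop and the one-line array section in the inner routine, which is the kind of small edit the sample description refers to.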
