Published on: October 28, 2009 1:00 AM PDT
Posted by Ilya Mirman originally on www.cilk.com on Thu, Apr 16, 2009
(The following post is from the engineers at our partner Sitrus, an outsourced development firm that offers multicore-enablement services for C++ applications using Cilk++.)
To evaluate Cilk++, we decided to implement several algorithms from different areas of numeric computation and to compare their performance with C++ equivalents. Four functions were implemented:
[Formulas omitted: the list of the four functions and the accompanying update formula ("... then the value of the next point will be computed as: ..., where ...") were images in the original post and did not survive extraction.]
The algorithms were first implemented in C++ and then converted to parallel form using Cilk++'s cilk_for, cilk_spawn and cilk_sync keywords. The conversion was very easy (the source code is available here). The Cilkscreen race detector proved very useful for debugging the parallel versions of the programs.
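To give a feel for what such a conversion looks like, here is a minimal sketch. It is not taken from the benchmark sources. The functions sample_squares and range_sum, the grain size of 1024, and the USE_CILK_KEYWORDS flag are all hypothetical names chosen for illustration. When the flag is not defined, the three keywords are stubbed out to their serial meaning, so the same file also builds as ordinary serial C++.

```cpp
#include <cstddef>
#include <vector>

// Assumption: USE_CILK_KEYWORDS is a hypothetical flag that you define only
// when building with a Cilk-aware compiler. Without it, the keywords are
// replaced by their serial equivalents and the program runs on one core.
#ifndef USE_CILK_KEYWORDS
#define cilk_for for
#define cilk_spawn
#define cilk_sync
#endif

// Loop parallelism: every iteration writes a different element of `out`,
// so the iterations are independent and there is no data race.
void sample_squares(std::vector<double>& out, double a, double h) {
    cilk_for (std::size_t i = 0; i < out.size(); ++i) {
        double x = a + h * static_cast<double>(i);
        out[i] = x * x;
    }
}

// Recursive parallelism: the left half may run in parallel with the right
// half; cilk_sync waits for the spawned call before `left` is read.
double range_sum(const double* v, std::size_t n) {
    if (n <= 1024) {              // hypothetical grain size: small ranges stay serial
        double s = 0.0;
        for (std::size_t i = 0; i < n; ++i) s += v[i];
        return s;
    }
    std::size_t half = n / 2;
    double left;
    left = cilk_spawn range_sum(v, half);
    double right = range_sum(v + half, n - half);
    cilk_sync;
    return left + right;
}
```

Note that the sum is computed by divide and conquer rather than by accumulating into a shared variable inside cilk_for. A shared accumulator updated from parallel iterations would be a data race, which is the kind of bug a race detector is meant to catch.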
The tests were run on a dual-socket, quad-core AMD Opteron 2347 system (8 cores in total) with 20 GB of RAM, running Linux (kernel 2.6.27.5-117.fc10.x86_64).
The test programs were built with the gcc 4.2.4 compiler (Cilk Arts build 7007).
Each test was run 10 times, and the execution time was measured using Cilk++'s example_get_time() routine. Presented here are the average execution times of the serial C++ programs and of the multicore-enabled programs on 1, 2, 4, and 8 cores.
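The measurement procedure described above can be sketched roughly as follows. This is an illustration only: it uses the standard std::chrono::steady_clock instead of example_get_time(), whose signature is not shown in this post, and the helper names average_ms and speedup are hypothetical.

```cpp
#include <chrono>
#include <functional>

// Run `work` `runs` times and return the average wall-clock time per run,
// in milliseconds. Assumption: std::chrono stands in for the timing routine
// used in the actual tests.
double average_ms(const std::function<void()>& work, int runs = 10) {
    using clock = std::chrono::steady_clock;
    double total_ms = 0.0;
    for (int r = 0; r < runs; ++r) {
        auto start = clock::now();
        work();
        auto stop = clock::now();
        total_ms += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total_ms / runs;
}

// Speedup of a parallel run relative to the serial baseline.
double speedup(double serial_ms, double parallel_ms) {
    return serial_ms / parallel_ms;
}
```

With helpers like these, one would time the serial program once and the parallel program at each core count, then report speedup(serial, parallel) for each configuration.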

Comparing these results with the estimates produced by the Cilkscreen Parallel Performance Analyzer, we see that the measured speedups fall in the middle of the estimated speedup range:







