Larger Test Data

BradleyKuszmaul
July 1, 2009 5:10 PM PDT
#3 Reply to #2

Here is my performance on the 400,000 row dataset on a 2.2GHz AMD Opteron 8354:

cores  runtime  speedup  speedup/cores
 1     25.59s     1        1
 2     12.82s     1.99     .995
 4      6.50s     3.94     .985
 8      3.34s     7.67     .959
12      2.34s    10.96     .913
14      2.02s    12.67     .905
16      1.98s    12.92     .808
c++    23.08s     1.11          (the C++ code with no Cilk++ constructs)

The machine is busy running some other code, so only about 13 or 14 cores are available, hence the falloff in speedup/cores at the top end.

My 1.6GHz laptop is somewhat slower (about 42.9s on two cores). The clock rate difference is much smaller than the speed difference, so I think this says the 16-core machine has a better memory architecture than my laptop.  (What kind of machine did you run on?)

The C++ code is about 11% faster than the Cilk++ code on one processor. I'm still looking into that.


