Stream benchmark performance

gkli
October 20, 2008 3:24 PM PDT
#3 Reply to #2

A couple of comments:

First, can we assume the proper amount of memory was allocated for a, b, c? (N doubles each)

Second, you might check the disassembly to see whether register pressure causes the pointers to one or more of aa, bb, cc to be refetched from memory as opposed to remaining cached.

Jim Dempsey

Yes, the arrays were allocated with

a = (double *) malloc (N*sizeof(double)); 

likewise for b and c.

I ran the benchmark for N=2000000.  Would it be reasonable to assume that there is sufficient work in the loop that whether the pointer is cached would not significantly affect performance?
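
For reference, here is a minimal sketch of the setup being discussed. The loop itself is not shown in this post, so the triad form below is an assumption taken from the standard STREAM kernel, and the names alloc_filled and triad are hypothetical helpers, not the code I actually ran. Passing the arrays as restrict-qualified parameters (C99) tells the compiler they do not alias, which should make it easier to keep the pointers in registers; whether it actually does so still has to be confirmed in the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and set each to value.
   Returns NULL if malloc fails, so the caller can check the allocation. */
double *alloc_filled(size_t n, double value)
{
    double *p = malloc(n * sizeof *p);
    if (p != NULL) {
        for (size_t i = 0; i < n; i++)
            p[i] = value;
    }
    return p;
}

/* Assumed STREAM-style triad: a[i] = b[i] + scalar * c[i].
   The restrict qualifiers promise the compiler that a, b and c do not
   overlap, so it has no aliasing reason to reload the pointers. */
void triad(double *restrict a, const double *restrict b,
           const double *restrict c, double scalar, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}
```

With N=2000000, each of a, b and c would be allocated through alloc_filled(N, ...) and checked for NULL before calling triad(a, b, c, scalar, N).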

 


