Stream benchmark performance

jimdempseyatthecove
October 17, 2008 12:01 PM PDT
#2
Quoting - gkli

I ran McCalpin's STREAM benchmark on the 3.4 GHz Xeon and got 4759 MB/s for Triad. The original code was inlined and operated directly on the global arrays. I added a similar function that takes the arrays as arguments instead of using the globals directly. The bandwidth for the new Triad function was only 3400 MB/s. Why did I lose so much bandwidth? The opt-report indicated that both functions were inlined.

#define N  2000000

static double *a;
static double *b;
static double *c;

int main()
{
...

      times[3][k] = mysecond();
        tuned_STREAM_Triad(a,b,c,scalar);  // original function, inlined  4759 MB/s
      times[3][k] = mysecond() - times[3][k];

      times[4][k] = mysecond();
        tuned_STREAM_Triad_Arg(a,b,c,scalar);  // new function, inlined  3400 MB/s
      times[4][k] = mysecond() - times[4][k];

      return 0;
}

void tuned_STREAM_Triad(double* aa,double* bb,double* cc,double scalar)
{
      int j;
#pragma omp parallel for
      for (j=0; j<N; j++)
          a[j] = b[j]+scalar*c[j];
}

void tuned_STREAM_Triad_Arg(double* restrict aa,double* bb,double* cc,double scalar)
{
      int j;
#pragma omp parallel for
#pragma ivdep
      for (j=0; j<N; j++)
          aa[j] = bb[j]+scalar*cc[j];
}

 

Compiled with

icc -openmp -restrict

A couple of comments:

First, can we assume that the proper amount of memory was allocated for a, b, and c (N doubles each)?
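
The posted code elides the allocation, so here is a minimal sketch of what "N doubles each" might look like. The helper names alloc_stream_arrays, init_stream_arrays and free_stream_arrays are hypothetical, and the initial values are placeholders chosen for this sketch, not values taken from the original program.

```c
#include <assert.h>
#include <stdlib.h>

#define N 2000000

static double *a, *b, *c;

/* Hypothetical helper: allocate N doubles for each array.
   Returns 0 on success, -1 (with nothing left allocated) on failure. */
static int alloc_stream_arrays(void)
{
    a = malloc(N * sizeof *a);
    b = malloc(N * sizeof *b);
    c = malloc(N * sizeof *c);
    if (a == NULL || b == NULL || c == NULL) {
        free(a); free(b); free(c);   /* free(NULL) is a no-op */
        a = b = c = NULL;
        return -1;
    }
    return 0;
}

/* Hypothetical helper: write every element once before any timing.
   The values are placeholders for this sketch. */
static void init_stream_arrays(void)
{
    int j;
    assert(a != NULL && b != NULL && c != NULL);
    for (j = 0; j < N; j++) {
        a[j] = 1.0;
        b[j] = 2.0;
        c[j] = 0.0;
    }
}

/* Hypothetical helper: release the arrays and reset the pointers. */
static void free_stream_arrays(void)
{
    free(a); free(b); free(c);
    a = b = c = NULL;
}
```

If any array were allocated with fewer than N elements, both Triad loops would write or read out of bounds, so it is worth ruling this out before comparing the two timings.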

Second, you might check the disassembly to see whether register pressure causes one or more of the pointers aa, bb, cc to be refetched from memory instead of remaining in registers.
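
One possible experiment for the reload question, sketched under assumptions: give each thread its own copy of the three pointers with the OpenMP firstprivate clause, then compare the assembly listing (for example, one produced by compiling with -S) against the listing for the original function. The function name triad_private_ptrs and the explicit length parameter n are hypothetical and not part of the posted code. The restrict qualifier and #pragma ivdep from the original are left out to keep the sketch portable. Whether this changes the generated code at all depends on the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of the Triad kernel: each thread gets private
   copies of the pointers and the scalar, so they need not be read
   through shared storage inside the loop. Without OpenMP enabled the
   pragma is ignored and the loop runs serially. */
static void triad_private_ptrs(double *aa, const double *bb,
                               const double *cc, double scalar, int n)
{
    int j;
    assert(aa != NULL && bb != NULL && cc != NULL && n >= 0);
#pragma omp parallel for firstprivate(aa, bb, cc, scalar)
    for (j = 0; j < n; j++)
        aa[j] = bb[j] + scalar * cc[j];
}
```

If the listing for this variant keeps the pointers in registers while the listing for tuned_STREAM_Triad_Arg reloads them on every iteration, that would support the register-pressure explanation. If the two listings match, the cause is probably elsewhere.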

Jim Dempsey


