Benchmarking code


Michael Conlen

I'm trying to benchmark LAPACK operations on Linux and I'm getting a funny result. A simple program that performs one small dsyev() takes 0.205s of real time (I'm using real time on an unloaded system, since user time is difficult to interpret on a multi-core CPU). User time is 0.000s and sys time is 0.004s. If I repeat the same operation in a loop, the real time stays constant up to about 610 iterations, though this isn't consistent: sometimes 610 iterations still produces a real time of 0.205s. At this point the user time is about 0.800 seconds (4 cores).

As I continue to increase the count, the real time stays constant for about another 610 iterations, at which point it suddenly jumps to 0.605 seconds.
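One way to check whether the constant part of the real time is a fixed per-process cost rather than time spent inside the loop is to time the loop in-process with a monotonic clock instead of timing the whole program. The following is only a minimal sketch under that assumption: `wall_seconds`, `kernel`, `time_calls` and `sweep` are hypothetical names, and `kernel` is a stand-in that would be replaced by the actual dsyev() call.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic clock. */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical stand-in for the dsyev() call; replace with the real kernel. */
volatile double sink;
void kernel(void)
{
    double s = 0.0;
    for (int k = 1; k <= 1000; k++)
        s += 1.0 / (double)k;
    sink = s;   /* volatile store keeps the loop from being optimized away */
}

/* Run `warmup` untimed calls, then return the wall time of `iterations` calls. */
double time_calls(void (*fn)(void), int warmup, int iterations)
{
    assert(warmup >= 0 && iterations >= 0);
    for (int i = 0; i < warmup; i++)
        fn();
    double t0 = wall_seconds();
    for (int i = 0; i < iterations; i++)
        fn();
    return wall_seconds() - t0;
}

/* Print total and per-call time for iteration counts 1, 2, 4, ... */
void sweep(void (*fn)(void), int max_iterations)
{
    for (int n = 1; n <= max_iterations; n *= 2) {
        double t = time_calls(fn, 1, n);
        printf("%8d iterations: %.6f s total, %.3e s per call\n", n, t, t / n);
    }
}
```

Calling `sweep(kernel, 4096)` from `main` prints one line per iteration count. If the per-call time is roughly flat here while the real time of the whole program stays constant or moves in steps, the difference comes from outside the timed loop.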

Is there any reason specific to MKL that this would be the case? I don't get this kind of result when linking against CLAPACK/ATLAS.

For reference

Linux beast 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

and I'm building with

$ gcc -O2 -L/opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/ -L/opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/ -I/opt/intel/mkl/include -c -o mklev.o mklev.c
$ gcc -O2 -L/opt/intel/composer_xe_2011_sp1.11.339/mkl/lib/intel64/ -L/opt/intel/composer_xe_2011_sp1.11.339/compiler/lib/intel64/ -o mklev mklev.o -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lm

The code is a slightly modified version of the example

lwork = -1;
dsyev("V", "U", &n, a, &lda, w, &wkopt, &lwork, &info);
lwork = (int)wkopt;
if((work = (double *)malloc(lwork*sizeof(double))) == NULL) { perror("malloc()"); goto error2; }
for(i=0; i<count; i++) { /* loop bound was lost in the original post; "count" is a placeholder name */
    dsyev("V", "U", &n, a, &lda, w, work, &lwork, &info);
    if(info > 0) { printf("The algorithm failed to compute eigenvalues\n"); goto error3; }
}
