Anomalous behavior of Intel MKL.

I am using the Intel MKL routine zgemm() to multiply two complex matrices
on a dual-core machine with a clock speed of 2.79 GHz.

When I run the program with neither OMP_NUM_THREADS nor KMP_AFFINITY
set, I get approximately 2700 MFLOPS. When I set OMP_NUM_THREADS=2 and
KMP_AFFINITY=(null), the rate drops to 1390 MFLOPS. When I then unset
KMP_AFFINITY, the FLOP rate goes down even further, to 1000 MFLOPS.
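
For reference, here is a minimal sketch of how a MFLOPS figure like the ones above can be derived from a timed zgemm() call. It assumes the conventional count of 8*m*n*k real floating-point operations for a complex matrix multiply (an m-by-k matrix times a k-by-n matrix). The helper name zgemm_mflops and its parameters are hypothetical and are not part of MKL; the elapsed time is assumed to be wall-clock seconds measured around the call.

```c
/* Hypothetical helper (not an MKL function): converts the wall-clock time
 * of one zgemm() call into MFLOPS.
 * Assumes C = A*B with A of size m-by-k and B of size k-by-n. */
double zgemm_mflops(long m, long n, long k, double seconds)
{
    /* Guard against a zero or negative timer reading. */
    if (seconds <= 0.0) {
        return 0.0;
    }

    /* One complex multiply-add costs 8 real flops:
     * 6 for the complex multiply (4 mul + 2 add) and 2 for the complex add. */
    double flops = 8.0 * (double)m * (double)n * (double)k;

    /* Flops per second, scaled to millions. */
    return flops / seconds / 1.0e6;
}
```

When comparing runs, the time passed in should be wall-clock time rather than CPU time, since CPU time summed over two threads would understate the multithreaded rate.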

Why does the single-threaded run perform better than the run where I specify two threads?

TIA
