DGEMM from MKL appears to perform its floating-point operations in a slightly different order on different chips. For example, I get slightly different numerical results from the same program on a Q9450 and on an i7. This is presumably because the library dynamically chooses its kernel, cache blocking, etc. to match the machine on which it is running. That changes the order in which the products are accumulated, and since floating-point addition is not associative, a different order can round to a different result.
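To illustrate the kind of effect I mean, here is a toy sketch in Python. It is not MKL's actual kernel; `dot_blocked` and the input values are made up for the example, and the only assumption is ordinary IEEE 754 double precision. It accumulates the same dot product with two different block sizes, which stands in for two different cache-blocking choices:

```python
def dot_blocked(x, y, block):
    """Dot product accumulated in blocks of `block` elements.

    Hypothetical helper: each block is summed into its own partial
    result, and the partials are then added together in order.
    """
    total = 0.0
    for start in range(0, len(x), block):
        partial = 0.0
        for xi, yi in zip(x[start:start + block], y[start:start + block]):
            partial += xi * yi
        total += partial
    return total


# Inputs chosen so that rounding is visible: near 1e16 the spacing
# between adjacent doubles is 2, so adding 1.0 to 1e16 is lost.
x = [1e16, 1.0, -1e16, 1.0]
y = [1.0, 1.0, 1.0, 1.0]

sequential = dot_blocked(x, y, block=len(x))  # one block: plain left-to-right sum
blocked = dot_blocked(x, y, block=2)          # two blocks of two elements

print(sequential, blocked)
```

On a machine with standard double-precision arithmetic this prints `1.0 0.0`: the same inputs and the same mathematical operation, but a different grouping of the additions. The differences I see from DGEMM are far smaller than this contrived case, but I assume the mechanism is the same.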
My question: is there any way to obtain the Q9450 behavior on an i7 chip? I understand it will probably run slower; my only concern is producing exactly the same numerical result on both machines. I browsed the MKL environment variables in the documentation, but didn't find anything that would allow me to 'detune' the library.