I'm using IPP 8.1 in both its 32-bit and 64-bit forms, as DLLs. I didn't compile them myself, but the person who did guarantees that both are the same version, built the same way.
In my experience, ippsPhase_32fc (used on the output of a 1024-point FFT, so around 500 complex pairs), which is already a fairly CPU-expensive function, is nearly three times slower in the 64-bit build.
I'm using many IPP functions and have not noticed much difference for the others (the 64-bit versions are sometimes very slightly slower), so it's most likely not a problem of branching, at least not a global one. It's of course not a problem of precision either: we're talking about the same _32fc data type in both builds.
Has anyone experienced the same?
I haven't yet tested whether the same is true for ATan2.