We are seeing some strange performance with the dgemm operator, which seems to depend on the content of the source matrices. Content, not size. It is very strange: if we use the same matrix sizes with randomly generated data (uniform or normal) or a constant value, everything appears fine and the timings are relatively consistent. The data itself doesn't seem too strange:
Max: 0.0997145, Min: -0.3362, Avg: -3.5246e-006
Most of the values hover close to that average, with a few spikes clustered mostly in one area. What I can't understand is why the values would have any effect on performance at all. It is just a matrix multiplication, right?
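For reference, the timing comparison we're doing looks roughly like this. This is a minimal sketch in NumPy (whose float64 matmul dispatches to a BLAS dgemm); the size, repeat count, and distributions are illustrative, not our actual workload:

```python
import time
import numpy as np

def time_matmul(a, b, repeats=5):
    """Return the best-of-N wall time for a @ b (BLAS dgemm for float64)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - t0)
    return best

n = 512
rng = np.random.default_rng(0)

# The three "well-behaved" cases that all time consistently for us.
cases = {
    "uniform":  rng.uniform(-0.5, 0.5, (n, n)),
    "normal":   rng.normal(0.0, 0.1, (n, n)),
    "constant": np.full((n, n), 0.1),
}

for name, a in cases.items():
    print(f"{name:10s} {time_matmul(a, a):.6f} s")
```

With random or constant inputs like these, the per-case timings come out close to each other; it is only our real data, with the stats quoted above, that runs slower.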