I have compiled SPEC FP 06 using the Intel 14.0.0 compiler suite. Performance is great, but looking at the code-gen distribution through SDE, I note that only about 0.1% of the instructions executed are FMA3. When I compiled with Open64 in the past, about 7% of the instructions executed were FMA variants, and performance improved by approximately 5% when compiling with FMA3 versus without it. I'm using the -xCORE-AVX2 compiler flag on my Haswell, but the compiler does not appear to be "efficiently leveraging" FMA3. Is there another flag I must use to get the Intel 14.0.0 compiler to generate FMA instructions? I'm quite confident there's a missed opportunity here and wanted to bring it to someone's attention.
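To make the question concrete, here is a minimal sketch of the kind of loop I have in mind. It is not taken from SPEC; the function names `dot_product` and `axpy` are hypothetical and only for illustration. Each loop body is a multiply feeding an add, so I would assume a compiler targeting Haswell could contract it into a single FMA3 instruction.

```c
#include <stddef.h>

/* Hypothetical example, not SPEC code: a reduction whose body is
 * "acc = acc + a[i] * b[i]", i.e. one multiply feeding one add.
 * This is the pattern I would assume is eligible for FMA contraction. */
double dot_product(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Hypothetical example: y[i] = alpha * x[i] + y[i], another
 * multiply-then-add pattern with no cross-iteration dependency. */
void axpy(double alpha, const double *x, double *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

One way to check a small case like this is to compile it to assembly with -S and look for the FMA3 mnemonics (the vfmadd, vfmsub, vfnmadd and vfnmsub families) in the output, before going back to the full SDE instruction mix.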
I posted this in this forum because it's an ISA issue in the compiler and is not isolated to either the C or Fortran compiler.