Hi,

We are doing a performance comparison of SGEMM on Ivy Bridge, Tesla, and an APU. We found that the Ivy Bridge CPU and its integrated GPU reach only 13% (29 GFLOPS) and 33% (49 GFLOPS) of their theoretical peak performance, respectively. We would like to track down the sources of inefficiency by looking at the assembly code.

Is there a way to view the assembly code generated for the Intel HD Graphics 4000 in Ivy Bridge? We have tried the Intel offline compiler, but it only gives us the CPU assembly and the intermediate LLVM code.

Any advice is greatly appreciated.

Yao