I have noticed on several occasions (not often, but often enough) that the instruction count SDE reports for a binary and the count reported by PMC event 0xC0 can differ by an order of magnitude or more. I just ran a build of hmmer compiled with Intel 14.0 under SDE (v5.38, run with sde -mix -top_blocks 3000 on a Haswell system), and SDE reports that hmmer took 60 trillion instructions to execute. That number has to be bogus, since the Open64 build took only 1.05 trillion instructions as reported by the PMCs. "perlbench" shows the same pattern: SDE reports 5.6 trillion instructions, while the PMCs on hardware reported only 2.1 trillion for the Open64 build. I don't think Intel 14 generates more than 2x the number of instructions that Open64 does. Something is inflating the instruction counts reported by SDE for both "perlbench" and "hmmer". I have also seen this happen with GCC v4.6 builds of the same benchmarks.
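To put numbers on the gap, here is a quick sketch that computes the ratio of the SDE count to the PMC count for each benchmark. It uses only the figures quoted above; the dictionary layout and the function name inflation_ratio are just illustrative choices of mine. Keep in mind that the comparison crosses compilers (Intel 14.0 under SDE versus Open64 on hardware), so the ratios are a rough indication rather than an exact measure of the error.

```python
# Instruction counts quoted in this post, in trillions.
# "sde" is the Intel 14.0 build run under SDE; "pmc" is the Open64 build
# measured with hardware counters, so this is not a like-for-like comparison.
counts = {
    "hmmer": {"sde": 60.0, "pmc": 1.05},
    "perlbench": {"sde": 5.6, "pmc": 2.1},
}


def inflation_ratio(sde_count, pmc_count):
    """Return how many times larger the SDE count is than the PMC count."""
    return sde_count / pmc_count


ratios = {
    name: inflation_ratio(c["sde"], c["pmc"]) for name, c in counts.items()
}

for name, ratio in ratios.items():
    print(f"{name}: SDE reports {ratio:.1f}x the PMC count")
# hmmer: SDE reports 57.1x the PMC count
# perlbench: SDE reports 2.7x the PMC count
```

So hmmer is off by a factor of roughly 57, while perlbench is off by a factor of roughly 2.7. Neither looks like a plausible compiler-to-compiler difference to me.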
Does anyone have an idea what is happening in SDE and why I am seeing these bloated instruction counts? I wanted to report it so that you are aware of the issue, and I would appreciate any pointers toward a solution. Thanks.