I'm trying to profile the cache performance of an application and noticed something strange in the VTune results.
vpshufd instructions show nonzero counts for L2_DATA_READ_MISS_MEM_FILL even though both the source and destination operands are registers.
Address   Source Line  Assembly                            L2_DATA_READ_MISS_MEM_FILL  CPU_CLK_UNHALTED
0x407afa  367          vpshufd $0x44, %zmm26, %k0, %zmm27  1,600,000                   24,000,036
I noticed this statement about the event in the KNC PMU events reference: "Can include promoted read misses that started as CODE accesses."
Is this likely to be the reason for the counts on this instruction? If so, what does it actually mean?