I am currently running experiments on a 4-socket (Westmere-EX) HP machine. I am especially interested in QPI link usage and utilization, which I try to measure with the Performance Counter Monitor. After studying the source code, I am puzzled about two events in particular: what they count and how they are used.
I am trying to understand the events FLT_SENT (Flit Sent) and NULL_IDLE (Null Idle Flit Sent). The tool uses FLT_SENT events together with the InvariantTSC, for example, to compute the maximum QPI link speed (cpucounters.cpp, PCM::computeQPIspeed(int core_nr)). That makes me think that one flit (8 bytes?) can be sent per TSC cycle, and that one FLT_SENT event occurs per cycle. Is that right?
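To make my reading concrete, here is a minimal sketch of how I assume such a speed estimate could work. This is not the actual PCM code: the function name and parameters are hypothetical, and the 8 bytes per flit and the TSC frequency are my assumptions.

```cpp
#include <cstdint>

// Hypothetical helper, NOT taken from PCM: estimates link throughput in
// bytes per second from two counter snapshots.
// Assumptions: every FLT_SENT event is one flit carrying 8 bytes, and the
// invariant TSC ticks at a constant tsc_hz.
double estimate_qpi_bytes_per_second(std::uint64_t flits_start,
                                     std::uint64_t flits_end,
                                     std::uint64_t tsc_start,
                                     std::uint64_t tsc_end,
                                     double tsc_hz)
{
    // Bytes moved during the interval (assumed 8 bytes per flit).
    const double bytes = static_cast<double>(flits_end - flits_start) * 8.0;
    // Wall-clock length of the interval derived from the TSC delta.
    const double seconds = static_cast<double>(tsc_end - tsc_start) / tsc_hz;
    return seconds > 0.0 ? bytes / seconds : 0.0;
}
```

If this is roughly what happens, the result would only be a maximum link speed if the link was saturated during the measurement interval, which is part of what I would like to have confirmed.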
In another method of the tool (cpucounters.h, getOutgoingQPILinkUtilization), NULL_IDLE flits are counted and used together with the UncoreTSC to estimate the QPI link utilization. I do not understand how that works. I was also confused when I counted FLT_SENT and NULL_IDLE and found more NULL_IDLE events than FLT_SENT events.
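As far as I can tell, the utilization estimate treats each NULL_IDLE flit as one idle uncore cycle. The sketch below shows that interpretation only. The function name and parameters are hypothetical, and the "one idle flit per uncore cycle" relation is my assumption, not something I have confirmed.

```cpp
#include <cstdint>

// Hypothetical helper, NOT taken from PCM: estimates outgoing link
// utilization as the fraction of uncore cycles without a NULL_IDLE flit.
// Assumption: at most one flit slot exists per uncore cycle.
double estimate_outgoing_utilization(std::uint64_t idle_start,
                                     std::uint64_t idle_end,
                                     std::uint64_t uncore_tsc_start,
                                     std::uint64_t uncore_tsc_end)
{
    const double idle_flits = static_cast<double>(idle_end - idle_start);
    const double cycles =
        static_cast<double>(uncore_tsc_end - uncore_tsc_start);
    // Guard against an empty interval or more idle flits than cycles.
    if (cycles <= 0.0 || idle_flits > cycles) {
        return 0.0;
    }
    return 1.0 - idle_flits / cycles;
}
```

Under this reading, a mostly idle link would produce far more NULL_IDLE events than FLT_SENT events, which would match what I observed, but I am not sure the reading is correct.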
Can anybody shed light on what the FLT_SENT and NULL_IDLE events actually count? I would also be interested in the difference between InvariantTSC and UncoreTSC, since each is used to compute a different measure (see above).
Thanks a lot!