Statistics About QPI

Hi there,

I am working on a project related to QPI, and we need to collect some statistics. There are two CPUs (CPU A and CPU B) connected to each other by a QPI link. Each CPU has direct access to its own RAM, an SSD and a Niantic. CPU A may need to access RAM B, which is attached to CPU B. The data path is:

CPU A => CPU B (through QPI) => RAM B

The statistic we need is:

Time{CPU A access RAM B} / Time{CPU A access RAM A}

There are other statistics we are interested in for this topology, but the example above shows basically what we need: we want to compare resource access over QPI with direct access.

Has Intel tested this performance before? Are there any results that we could use?

Thanks for your time and help!

Regards,
Ye

Yes, CPU A may need to access memory connected to CPU B (remote memory access). Intel benchmarks the performance delta between local and remote memory access for both latency and bandwidth. Because usage models vary, some BIOS versions have options to modify the resource allocation for QPI. If you are configured for "non-NUMA", where 50% of all accesses go to remote memory, you may benefit from changing the QPI resources to provide more remote credits.
Have you tried PTU? It may have what you need for basic statistics on the QPI link between the CPUs.

http://software.intel.com/en-us/articles/intel-performance-tuning-utility/

http://software.intel.com/en-us/articles/optimizing-applications-for-numa/
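As a rough illustration of the "non-NUMA" case above, the average memory latency can be approximated as a weighted mean of the local and remote latencies. This is only a sketch: the function names are made up for this example, and the model assumes accesses split by a fixed fraction and ignores queuing and bandwidth effects.

```python
def average_latency_ns(local_ns, remote_ns, remote_fraction=0.5):
    """Weighted mean latency when a fraction of accesses go to remote memory.

    remote_fraction=0.5 models the "non-NUMA" configuration on a
    two-socket system, where half of all accesses are remote.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


def slowdown_vs_local(local_ns, remote_ns, remote_fraction=0.5):
    """Average latency relative to an all-local access pattern."""
    return average_latency_ns(local_ns, remote_ns, remote_fraction) / local_ns
```

Plug in latencies measured on your own system; the result shows how much a NUMA-aware placement (remote_fraction close to 0) could save under this simple model.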

Ye,

The latency depends a lot on processor type, platform and memory. Therefore, it is difficult to provide general numbers. However, it is fairly simple to measure the latencies on your system:

LMbench contains a microbenchmark, "lat_mem_rd", that measures memory latency. Together with the tool "numactl" (part of libnuma), you can use it to measure the latency of local and remote memory on your system:

numactl --cpunodebind=0 --membind=0 ./lat_mem_rd -t 1024

numactl --cpunodebind=0 --membind=1 ./lat_mem_rd -t 1024
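To turn the two runs into the ratio Time{CPU A access RAM B} / Time{CPU A access RAM A}, a small wrapper script could look like the sketch below. It assumes that lat_mem_rd prints data lines of the form "size_in_MB latency_in_ns" (captured here from both stdout and stderr) and that the latency at the largest array size approximates the DRAM latency. The helper names are hypothetical and not part of LMbench or numactl.

```python
import subprocess


def parse_lat_mem_rd(text):
    """Return (size_mb, latency_ns) pairs from lat_mem_rd output.

    Assumption: each data line holds exactly two numbers; anything else
    (for example the stride header) is skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            rows.append((float(parts[0]), float(parts[1])))
        except ValueError:
            continue
    return rows


def dram_latency_ns(rows):
    """Latency at the largest array size, used as a proxy for DRAM latency."""
    if not rows:
        raise ValueError("no data lines found in lat_mem_rd output")
    return max(rows, key=lambda row: row[0])[1]


def measure(mem_node, cpu_node=0, size_mb=1024, binary="./lat_mem_rd"):
    """Run lat_mem_rd under numactl with the CPU and memory node bindings."""
    cmd = ["numactl", f"--cpunodebind={cpu_node}", f"--membind={mem_node}",
           binary, "-t", str(size_mb)]
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return dram_latency_ns(parse_lat_mem_rd(proc.stdout + "\n" + proc.stderr))


def remote_to_local_ratio(local_ns, remote_ns):
    """Time{remote access} / Time{local access}."""
    return remote_ns / local_ns


# Usage on a two-node system (requires numactl and a built lat_mem_rd):
#   local = measure(mem_node=0)
#   remote = measure(mem_node=1)
#   print(remote_to_local_ratio(local, remote))
```

Repeating each measurement a few times and comparing the results helps to spot noise from other activity on the system.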

Kind regards
Thomas

Hi everyone, 

I am working on a project related to QPI too. There are two CPUs (CPU A and CPU B) connected to each other by a QPI link. Each CPU has direct access to its own RAM, an SSD and a PCI card. CPU A may need to access PCI card B, which is attached to CPU B.

We want to compare resource access over QPI with direct access. Besides Intel PCM, is there any other way to do that? It seems that PCM did not give the correct readings.

Come On !!! Do the research !!!

Hi everyone,

I am working on a project related to QPI too. We need to collect some QPI statistics.

There are two CPUs (CPU A and CPU B) connected to each other by a QPI link. Each CPU has direct access to its own RAM and a PCI card. CPU A may need to access PCI card B, which is attached to CPU B. We want to compare resource access over QPI with direct access too. Besides Intel PCM, is there any way to measure that?

Thank you very much. Looking forward to your reply.

Come On !!! Do the research !!!
