I try to measure the bus utilization on a Xeon 5400 machine, which has 1333MHz FSB and DDR2-667, when I do simple memory copy with 8 threads (the machine has 2 processors and 4 core in each processor). The throughput of memcpy (from one large chunk of memory to another large chunk of memory) is 3000MB/s.
I use oprofile to measureBUS_TRANS_ANY.ALL_AGENTS andCPU_CLK_UNHALTED.BUS. As suggested by Intel Optimization reference manual, the bus utilization can be measured as followBUS_TRANS_ANY.ALL_AGENTS * 2 / CPU_CLK_UNHALTED.BUS *100. When I do it, I only get 66%.
If I do simple memory copy, I should be able to saturate memory bus, right? Why do I only get 66%? Which part goes wrong?
Thanks,
Da
Understand bus utilization
Hi,



