Trying to understand AFREQ results in Intel "pcm" tool output, strange lmbench performance

I've got a system running dual Xeon E5-2648L (Sandy Bridge EP). I'm running Linux 2.6.27 and have ACPI processor support compiled in. With the stock setup, running pcm for 10 seconds on an idle system gives something like:

Core (SKT) | EXEC | IPC | FREQ | AFREQ | Inst | Cycles
0 0 | 0.062 | 0.881 | 0.070 | 0.862 | 1118 M | 1269 M

When running lmbench's memory latency test on core 0, I get something like:

Core (SKT) | EXEC | IPC | FREQ | AFREQ | Inst | Cycles
0 0 | 0.292 | 0.292 | 1.000 | 1.000 | 5336 M | 18 G

But when I run the memory latency test on core 2 (hyperthread sibling of core 0), I get something like:

Core (SKT) | EXEC | IPC | FREQ | AFREQ | Inst | Cycles
2 0 | 0.354 | 0.511 | 0.691 | 0.697 | 6371 M | 12 G

We don't have any frequency management stuff enabled in the OS, so it should be nailed to the nominal frequency of 1.8GHz. Core 0 looks good, shows 18G cycles in 10 sec. Core 2 (and all other even cores other than 0) only shows 12G cycles in 10 sec. This maps nicely to a FREQ of 0.691 (I'd have expected 0.667 but let's not quibble), but I don't understand why AFREQ is anything other than 1 given that we're not adjusting the clock rate.
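The expected FREQ values can be checked with a little arithmetic. This is only a sketch: it assumes FREQ is roughly the observed cycle count divided by (nominal frequency x interval), and it uses the rounded "18 G" and "12 G" figures that pcm printed. The function name is made up for illustration.

```python
NOMINAL_HZ = 1.8e9   # nominal frequency quoted in the post
INTERVAL_S = 10      # pcm sampling interval quoted in the post


def freq_ratio(cycles, nominal_hz=NOMINAL_HZ, interval_s=INTERVAL_S):
    """Observed cycles as a fraction of the cycles available at nominal speed."""
    return cycles / (nominal_hz * interval_s)


print(round(freq_ratio(18e9), 3))  # core 0: 1.0
print(round(freq_ratio(12e9), 3))  # core 2: 0.667
```

Since pcm rounds the cycle count to "12 G", a true count of about 12.4 G would be enough to account for the reported 0.691.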

Can anyone explain either the performance difference between core 0 and the other cores, or why AFREQ is smaller than expected?

Incidentally, if I run a cpu hog on core 0, then core 2 (and all other cores on socket 0) run at full speed and give 18G cycles.

The difference between FREQ and AFREQ is that FREQ measures the average frequency over the whole interval, while AFREQ measures the average frequency only while the core is not in a sleep state. In other words, FREQ and AFREQ should coincide when the core is not going into sleep states. This doesn't explain why you have lower frequency and better IPC on core 2 than on core 0, but it explains why AFREQ<1.
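If that description is taken at face value, the ratio FREQ/AFREQ gives a rough estimate of the fraction of time the core was awake. The sketch below applies that ratio to the three pcm rows from the original post. The helper name is hypothetical and the interpretation is an assumption derived from the description above, not something pcm itself reports.

```python
def active_fraction(freq, afreq):
    """Estimate the fraction of the interval the core was not asleep."""
    return freq / afreq


# (FREQ, AFREQ) pairs copied from the pcm output in the original post
rows = {
    "idle, core 0": (0.070, 0.862),
    "lmbench, core 0": (1.000, 1.000),
    "lmbench, core 2": (0.691, 0.697),
}

for label, (freq, afreq) in rows.items():
    print(f"{label}: awake about {active_fraction(freq, afreq):.1%} of the time")
```

On this reading, core 2 is awake almost the whole time under lmbench, so its low FREQ comes from a reduced clock rather than from sleeping.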

In case you haven't seen this already, there is also a new tool, pcm-power, included in the package that lets you explore a lot of the power features in the Intel Xeon processor E5 product family (Sandy Bridge EP).

Your description matches what I understand about FREQ/AFREQ.

An AFREQ value less than one means that my CPU is running at less than full speed when not in a sleep state. Given that I've got the frequency nailed at 1.8GHz and my OS is not configured to support any of the frequency-varying stuff like cpufreq, t-states, and such, this is highly unexpected.

Or am I missing something and there's another explanation?

Incidentally, it's actually lower frequency and worse latency on all non-zero even cores (2, 4, 6, 8, etc.) relative to core 0, and also on all non-zero odd cores (3, 5, 7, etc.) relative to core 1. Basically the first hyperthread sibling in each socket seems to be treated specially somehow.

The data suggests that nailing the frequency to 1.8GHz did not work. It looks like you are in C0 but not in the P1 state. You can cross-check the frequency bands with pcm-power. Maybe you missed a power setting somewhere.

Okay, here's what I've got:

I ran a cpu hog (just busy-loops) on cpu9. pcm gave

Core (SKT) | EXEC | IPC | FREQ | AFREQ | Inst | Cycles
9 1 | 1.314 | 1.851 | 0.710 | 0.726 | 23 G | 12 G

I'm at a loss as far as figuring out why it's not running at full speed. Everything I can find says it should be giving full speed.

/proc/cpuinfo shows good results for cpu MHz (which should vary with actual speed):
processor : 9
model name : Intel Xeon CPU E5-2648L 0 @ 1.80GHz
stepping : 7
cpu MHz : 1799.991

Repeated reads to /proc/acpi/processor/CPU9/power show the sleep counts not increasing, which makes sense since it's totally busy.

/proc/acpi/processor/CPU9/limit shows:
active limit: P0:T0
user limit: P0:T0
thermal limit: P0:T0

/proc/acpi/processor/CPU9/throttling shows the active state as T0.

"pcm-power.x 10 -p 0 -a 0 -b 12 -c 18" gives results that are a bit confusing, since I would expect only one socket to be running at full speed at all.

S0; PCUClocks: 8001175924; Freq band 0/1/2 cycles: 93.84%; 93.84%; 16.21%
S1; PCUClocks: 8001180611; Freq band 0/1/2 cycles: 94.49%; 94.49%; 17.50%

pcm-power.x 10 -p 1 shows 8 cores (out of 8) on each socket in C0. This is a bit unexpected since only one cpu is actually doing anything, the others are idle.

S0; PCUClocks: 8001546374; core C0/C3/C6-state residency: 7.99; 0.00; 0.00
S1; PCUClocks: 8001606837; core C0/C3/C6-state residency: 8.00; 0.00; 0.00

pcm-power.x 10 -p 2 shows no clipping:
S0; PCUClocks: 8001579072; Internal prochot cycles: 0.00 %; External prochot cycles:0.00 %; Thermal freq limit cycles:0.00 %
S1; PCUClocks: 8001589576; Internal prochot cycles: 0.00 %; External prochot cycles:0.00 %; Thermal freq limit cycles:0.00 %

pcm-power.x 10 -p 3 shows no clipping:
S0; PCUClocks: 8001591572; Thermal freq limit cycles: 0.00 %; Power freq limit cycles:0.00 %; Clipped freq limit cycles:0.00 %
S1; PCUClocks: 8001616545; Thermal freq limit cycles: 0.00 %; Power freq limit cycles:0.00 %; Clipped freq limit cycles:0.00 %

We ran into exactly the same problem, and we had a lot of smart folks scratching their heads for most of a day.

The problem appears to be that the processors are using P-states, but the kernel does not have the appropriate module loaded to control them. (This does not explain how it manages to work on core 0 on each chip, but....)

We were able to fix this in a couple of ways:
(1) "/etc/init.d/cpuspeed start" loaded the acpi-cpufreq and associated modules and started the cpuspeed service.
(2) "modprobe acpi-cpufreq" also fixed the problem.

When we started the cpuspeed service it set the frequency governor to "ondemand", which we did not want. Running "/etc/init.d/cpuspeed stop" stopped the service, set the frequency governor to "userspace", and left the required kernel modules loaded.

Once the required modules have been loaded you should be able to check the configuration by looking at the files in the /sys/devices/system/cpu/cpu*/cpufreq/ directories.
% cat scaling_cur_freq
% cat scaling_governor
% cat scaling_available_frequencies
% cat scaling_available_governors
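To check every CPU at once rather than running cat in each directory, something like the following could be used. It is a minimal sketch that assumes the cpufreq files listed above exist under /sys/devices/system/cpu/cpu*/cpufreq/. The function name and the returned dictionary layout are made up for illustration, and any file that is missing is reported as None.

```python
import glob
import os

# File names taken from the cat commands above
FIELDS = (
    "scaling_cur_freq",
    "scaling_governor",
    "scaling_available_frequencies",
    "scaling_available_governors",
)


def read_cpufreq(sys_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {field: text or None}} for every cpu*/cpufreq dir."""
    result = {}
    pattern = os.path.join(sys_root, "cpu[0-9]*", "cpufreq")
    for cpufreq_dir in sorted(glob.glob(pattern)):
        cpu = os.path.basename(os.path.dirname(cpufreq_dir))
        info = {}
        for name in FIELDS:
            try:
                with open(os.path.join(cpufreq_dir, name)) as f:
                    info[name] = f.read().strip()
            except OSError:
                info[name] = None  # file absent or unreadable
        result[cpu] = info
    return result


if __name__ == "__main__":
    for cpu, info in read_cpufreq().items():
        print(cpu, info["scaling_governor"], info["scaling_cur_freq"])
```

If the acpi-cpufreq module is not loaded, the cpufreq directories will typically be absent and the function returns an empty dictionary, which is itself a useful hint.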

On my system the available frequencies are:
2701000 2400000 2300000 2200000 2100000 2000000 1900000 1800000 1700000 1600000 1500000 1400000 1300000 1200000

I think that the unexpected digit in the 2701000 value means that the p0 state can run higher than 2.7 GHz via the Turbo mode mechanism.
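That reading can be turned into a quick check on the scaling_available_frequencies line. The sketch below rests on an assumption taken from the interpretation above: that a top entry sitting exactly 1000 kHz above a round 100 MHz step is a Turbo marker rather than a real operating point. The function name is hypothetical.

```python
def turbo_marker(available_frequencies_line):
    """Return (top_khz, looks_like_turbo_marker) for a frequency list in kHz."""
    freqs = [int(tok) for tok in available_frequencies_line.split()]
    top = max(freqs)
    # Heuristic: a value such as 2701000 is 1 MHz above a round 100 MHz step
    looks_like_marker = top % 100_000 == 1_000
    return top, looks_like_marker


line = ("2701000 2400000 2300000 2200000 2100000 2000000 1900000 "
        "1800000 1700000 1600000 1500000 1400000 1300000 1200000")
print(turbo_marker(line))  # (2701000, True)
```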

I hope this helps!

John D. McCalpin, PhD "Dr. Bandwidth"
