Configuring RAPL limits in Sandy Bridge Xeon processors

Hi,

I am trying to set RAPL power limits. Do I need to set the limits on every core, or is it enough to set them on core 0 of a socket? By "every core" I mean writing to all the /dev/cpu/<N>/msr nodes that belong to that socket.

Can we set different values for each of the cores in a socket?

-Nobin

Hello Nobin,

Have you read through section 14.7 of SDM vol 3? The scope of MSR_PKG_POWER_LIMIT is the package, so you should be able to set just that one MSR from any CPU on the package. MSR_PP0_POWER_LIMIT also has package scope, so it seems you can't set per-core limits, but I haven't read through all of section 14.7.

Pat
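As a rough illustration of "one write per package", here is a minimal Python sketch that picks one logical CPU per socket and reads MSR_PKG_POWER_LIMIT through the Linux msr driver. It assumes Linux with the `msr` module loaded and root privileges; the helper names (`first_cpu_per_package`, `read_topology`, `read_msr`, `dump_pkg_limits`) are made up for this example, and the address 0x610 is the one the SDM lists for MSR_PKG_POWER_LIMIT.

```python
import os
import struct

MSR_PKG_POWER_LIMIT = 0x610  # package-scope MSR address per SDM vol 3


def first_cpu_per_package(cpu_to_package):
    """Pick the lowest-numbered logical CPU for each package id."""
    chosen = {}
    for cpu in sorted(cpu_to_package):
        chosen.setdefault(cpu_to_package[cpu], cpu)
    return chosen


def read_topology(sysfs_root="/sys/devices/system/cpu"):
    """Map logical CPU number -> physical package id using Linux sysfs."""
    mapping = {}
    for name in os.listdir(sysfs_root):
        if name.startswith("cpu") and name[3:].isdigit():
            path = os.path.join(sysfs_root, name, "topology",
                                "physical_package_id")
            try:
                with open(path) as f:
                    mapping[int(name[3:])] = int(f.read())
            except OSError:
                pass  # offline CPUs may not expose a topology directory
    return mapping


def read_msr(cpu, address):
    """Read one 64-bit MSR via /dev/cpu/<N>/msr (root + msr module needed)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


def dump_pkg_limits():
    """Print MSR_PKG_POWER_LIMIT once per package, not once per core."""
    for package, cpu in sorted(first_cpu_per_package(read_topology()).items()):
        value = read_msr(cpu, MSR_PKG_POWER_LIMIT)
        print(f"package {package}: cpu {cpu} -> {value:#018x}")
```

Calling `dump_pkg_limits()` as root should print one line per socket. Writing a limit would go through the same device node opened for writing; that part is left out here because a wrong value can throttle the machine.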

It seems that PP0 cannot be bound to a specific core; it applies to the whole processor.

What is the purpose of the "time window power limit" field in the MSR_PKG_POWER_LIMIT register, and how is it used? On one of my Sandy Bridge systems I got the following reading:

MSR: 1556    Value: 13236883076154104
Max Watts Raw1760
Min Watts Raw432
Max time window Raw47
Max Watts 220.000000
Min Watts 54.000000
Max time window 0.742188

Raw unit values (the actual unit is 1/2^x):

Power units 3
Time units 10
Energy Status 16

The max time window is 0.742188 s, so my dumb question is: how is this field used during RAPL operation?

My understanding is that RAPL brings the system down to a low-power state for some time, but the max time window specified is very short.
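For anyone checking the numbers in this post, here is a small Python sketch that splits the raw value of MSR 1556 (0x614, MSR_PKG_POWER_INFO) into fields. The bit positions are taken from the SDM layout as I understand it (thermal spec power in bits 14:0, minimum power in 30:16, maximum power in 46:32, maximum time window in 53:48); treat them, and the function names, as assumptions of this sketch. One thing it shows: the low 15 bits decode to 760, and 760/1024 is 0.7421875, so the "Max time window 0.742188" line may come from dividing the wrong field by the time unit. The raw time window field itself is 47; how that raw value maps to seconds is not settled in this thread, so the sketch leaves it raw.

```python
def decode_units(power_exp, time_exp, energy_exp):
    """Turn the raw exponents from the post into real units (1 / 2^x)."""
    return (1.0 / (1 << power_exp),
            1.0 / (1 << time_exp),
            1.0 / (1 << energy_exp))


def decode_pkg_power_info(value):
    """Split MSR_PKG_POWER_INFO into raw fields (bit ranges assumed from SDM)."""
    return {
        "thermal_spec_raw": value & 0x7FFF,            # bits 14:0
        "min_power_raw": (value >> 16) & 0x7FFF,       # bits 30:16
        "max_power_raw": (value >> 32) & 0x7FFF,       # bits 46:32
        "max_time_window_raw": (value >> 48) & 0x3F,   # bits 53:48
    }


# Values quoted in the post: raw MSR contents and the three unit exponents.
value = 13236883076154104
power_unit, time_unit, energy_unit = decode_units(3, 10, 16)
fields = decode_pkg_power_info(value)

print(f"raw value        : {value:#018x}")
print(f"thermal spec     : {fields['thermal_spec_raw'] * power_unit} W")
print(f"min power        : {fields['min_power_raw'] * power_unit} W")
print(f"max power        : {fields['max_power_raw'] * power_unit} W")
print(f"time window (raw): {fields['max_time_window_raw']}")
# Dividing the thermal-spec field by 2^10 reproduces the number in the post.
print(f"760 * time unit  : {fields['thermal_spec_raw'] * time_unit}")
```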

Hi Nobin,

Please read section 14.7.3, Package RAPL Domain.

It may be also useful to look at the source code of Intel(r) Power Governor that can configure RAPL.

Thanks Roman,

Where can I find the source code of Intel Power Governor?

-Nobin

Roman,

I got it, I will look into the source code.

-Nobin

Power_gov fails to initialize on my system. It gives:

"RAPL not supported, or machine model 306e2 not recognized.
Init failed!"

In the BIOS:

DRAM RAPL BWLIMIT = 1
DRAM RAPL (Running Average Power Limit) Mode = DRAM RAPL Mode 1

Any idea why power_gov fails?

I would ask the authors.

What is your CPU?

It is Sandy Bridge.

cat /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Genuine Intel(R) CPU  @ 2.20GHz
stepping        : 2
cpu MHz         : 2201.000
cache size      : 25600 KB
physical id     : 0
siblings        : 20
core id         : 0
cpu cores       : 10
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips        : 4399.64
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

dmidecode:

Handle 0x0004, DMI type 4, 42 bytes
Processor Information
    Socket Designation: SOCKET 0
    Type: Central Processor
    Family: <OUT OF SPEC>
    Manufacturer: Intel
    ID: E2 06 03 00 FF FB EB BF
    Version: Genuine Intel(R) CPU @ 2.20GHz
    Voltage: 0.0 V
    External Clock: 100 MHz
    Max Speed: Unknown
    Current Speed: 2200 MHz
    Status: Populated, Enabled
    Upgrade: <OUT OF SPEC>
    L1 Cache Handle: 0x0005
    L2 Cache Handle: 0x0006
    L3 Cache Handle: 0x0007
    Serial Number: Not Specified
    Asset Tag: Not Specified
    Part Number: Not Specified
    Core Count: 10
    Core Enabled: 10
    Thread Count: 20
    Characteristics:
        64-bit capable
        Multi-Core
        Hardware Thread
        Execute Protection
        Enhanced Virtualization
        Power/Performance Control

-Nobin

Hi Nobin, 

I updated the tool and will post it on the web page in a day or two. The reason for the error was that I had not yet updated the machine codes for some of the new Ivy Bridge processors, so that check was failing. Once the update is posted, please check the download page again.

http://software.intel.com/en-us/articles/intel-power-governor

Thank you

Martin

Martin

The processor is Sandy Bridge.

Can you send a copy of it to me if it is possible, nobin.mathew@gmail.com.

AFAIK Sandy Bridge CPUID signatures start with 0206xxh and the Ivy Bridge signature starts with 0306A9h. It seems strange that the software failed to decode the CPUID output properly.

root> ./cpuid
 eax in    eax      ebx      ecx      edx
00000000 0000000d 756e6547 6c65746e 49656e69
00000001 000306e2 00200800 7fbee3ff bfebfbff
00000002 76035a01 00f0b2ff 00000000 00ca0000
00000003 00000000 00000000 00000000 00000000
00000004 00000000 00000000 00000000 00000000
00000005 00000040 00000040 00000003 00001120
00000006 00000077 00000002 00000009 00000000
00000007 00000000 00000000 00000000 00000000
00000008 00000000 00000000 00000000 00000000
00000009 00000001 00000000 00000000 00000000
0000000a 07300403 00000000 00000000 00000603
0000000b 00000000 00000000 00000000 00000000
0000000c 00000000 00000000 00000000 00000000
0000000d 00000000 00000000 00000000 00000000
80000000 80000008 00000000 00000000 00000000
80000001 00000000 00000000 00000001 2c100800
80000002 20202020 20202020 20202020 20202020
80000003 756e6547 20656e69 65746e49 2952286c
80000004 55504320 20402020 30322e32 007a4847
80000005 00000000 00000000 00000000 00000000
80000006 00000000 00000000 01006040 00000000
80000007 00000000 00000000 00000000 00000100
80000008 0000302e 00000000 00000000 00000000

Vendor ID: "GenuineIntel"; CPUID level 13

Intel-specific functions:
Version 000306e2:
Type 0 - Original OEM
Family 6 - Pentium Pro
Model 14 -
Stepping 2
Reserved 12

Extended brand string: "                Genuine Intel(R) CPU  @ 2.20GHz"
CLFLUSH instruction cache line size: 8
Hyper threading siblings: 32

Feature flags bfebfbff:
FPU    Floating Point Unit
VME    Virtual 8086 Mode Enhancements
DE     Debugging Extensions
PSE    Page Size Extensions
TSC    Time Stamp Counter
MSR    Model Specific Registers
PAE    Physical Address Extension
MCE    Machine Check Exception
CX8    COMPXCHG8B Instruction
APIC   On-chip Advanced Programmable Interrupt Controller present and enabled
SEP    Fast System Call
MTRR   Memory Type Range Registers
PGE    PTE Global Flag
MCA    Machine Check Architecture
CMOV   Conditional Move and Compare Instructions
FGPAT  Page Attribute Table
PSE-36 36-bit Page Size Extension
CLFSH  CFLUSH instruction
DS     Debug store
ACPI   Thermal Monitor and Clock Ctrl
MMX    MMX instruction set
FXSR   Fast FP/MMX Streaming SIMD Extensions save/restore
SSE    Streaming SIMD Extensions instruction set
SSE2   SSE2 extensions
SS     Self Snoop
HT     Hyper Threading
TM     Thermal monitor
31     reserved

TLB and cache info:
5a: unknown TLB/cache descriptor
03: Data TLB: 4KB pages, 4-way set assoc, 64 entries
76: unknown TLB/cache descriptor
ff: unknown TLB/cache descriptor
b2: unknown TLB/cache descriptor
f0: unknown TLB/cache descriptor
ca: unknown TLB/cache descriptor
Processor serial: 0003-06E2-0000-0000-0000-0000
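The signature 000306e2 in the dump above can be decoded by hand. The following Python sketch applies the usual CPUID leaf 1 rule for Intel parts (the extended model bits are prepended to the model when the family is 6 or 15); the function name is made up for this example. It gives model 62, which matches the "model : 62" line from /proc/cpuinfo, whereas the "Model 14" printed by the cpuid tool above is only the base 4-bit model field.

```python
def decode_cpuid_signature(eax):
    """Decode CPUID leaf 1 EAX into (display family, display model, stepping)."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # Extended family only contributes when the base family is 0xF.
    display_family = family + ext_family if family == 0xF else family
    # Extended model only contributes for families 6 and 15.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model
    return display_family, display_model, stepping


family, model, stepping = decode_cpuid_signature(0x000306E2)
print(f"family {family}, model {model} ({model:#x}), stepping {stepping}")
```

A tool that whitelists known family/model pairs will reject this part until model 62 (0x3E) is added to its table, which is consistent with the "machine model 306e2 not recognized" message reported earlier.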

Sorry, I am using Ivy Bridge, not Sandy Bridge.

It is one of the 10-core parts.

Sorry, it is Ivy Bridge I am using.

-Nobin

OK. I updated the tool to support your CPUID. I will email it to you as well.

thanks

Martin

Martin

Hi Martin,

Thanks for the power_gov.

I am using an Ivy Bridge server chip which has 10 cores (I work for some company).

My understanding of RAPL lacks some fundamentals; can you answer the questions below?

Let's think only about the PKG domain.

We have two power limits in the MSR_PKG_POWER_LIMIT register for the PKG domain, say power_limit1/time_window1 and power_limit2/time_window2.

We just go ahead and set values in the MSR_PKG_POWER_LIMIT register; after that it is up to the system to determine how to use them (i.e. when to apply which limit).

How is the system going to use them?

After enabling both (with clamping), will system power consumption be limited to between power_limit1 and power_limit2?

What is the significance of time_window1 and time_window2? How are they used?

-Nobin
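To make the two limit/window pairs concrete, here is a Python sketch that decodes a MSR_PKG_POWER_LIMIT value. The layout is my reading of SDM vol 3 section 14.7: limit #1 in bits 14:0 with enable (15), clamp (16) and a 7-bit time window (23:17); limit #2 mirrored in bits 46:32, 47, 48 and 55:49; lock in bit 63; and the window encoded as 2^Y * (1 + Z/4) * time unit. As I read the SDM, each limit is an average power over its own window. The function names and the sample value are hypothetical (the value is built by hand, not read from any machine), and the default units assume the exponents 3 and 10 reported earlier in the thread.

```python
def decode_time_window(field, time_unit):
    """7-bit window field: seconds = 2^Y * (1 + Z/4) * time_unit."""
    y = field & 0x1F         # low five bits of the field
    z = (field >> 5) & 0x3   # high two bits of the field
    return (1 << y) * (1.0 + z / 4.0) * time_unit


def decode_limit(half, power_unit, time_unit):
    """Decode one 32-bit half of the register (same layout for both limits)."""
    return {
        "watts": (half & 0x7FFF) * power_unit,
        "enabled": bool((half >> 15) & 1),
        "clamp": bool((half >> 16) & 1),
        "window_s": decode_time_window((half >> 17) & 0x7F, time_unit),
    }


def decode_pkg_power_limit(value, power_unit=1 / 8, time_unit=1 / 1024):
    """Decode both limits and the lock bit (layout assumed from the SDM)."""
    return {
        "limit1": decode_limit(value & 0xFFFFFFFF, power_unit, time_unit),
        "limit2": decode_limit((value >> 32) & 0x7FFFFFFF, power_unit, time_unit),
        "locked": bool(value >> 63),
    }


# Hypothetical register contents, assembled field by field for illustration:
# limit 1 = 95 W, enabled, clamped, window field Y=10, Z=0
# limit 2 = 114 W, enabled, not clamped, window field Y=3, Z=1
half1 = 760 | (1 << 15) | (1 << 16) | (10 << 17)
half2 = 912 | (1 << 15) | ((3 | (1 << 5)) << 17)
sample = (half2 << 32) | half1

decoded = decode_pkg_power_limit(sample)
for name in ("limit1", "limit2"):
    print(name, decoded[name])
print("locked", decoded["locked"])
```

In this made-up setting the first limit is averaged over a window of about one second and the second over roughly ten milliseconds, which is the usual shape: a long-term limit plus a higher short-term one.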

MSR_PKG_POWER_LIMIT

Processor family datasheets are much more helpful than the system programmer's manuals. They document RAPL much more clearly.

Quote:

Nobin M. wrote:

Processor family datasheets are much more helpful than the system programmer's manuals. They document RAPL much more clearly.

Can you point to some datasheets?

Thank you in advance

Intel® Xeon® Processor E5-1600/E5-2600/E5-4600 Product Families Datasheets Volume One

http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/x...

Thank you Nobin:)

Hello Martin,

I am trying to use the power governor tool on an Ivy Bridge server (Intel(R) Xeon(R) CPU E5-2680 v2), but I get:

PKG_0,Core_0,Graphics_0,Uncore_0,PKG_1,Core_1,Graphics_1,Uncore_1,
*** Error in `./power_gov': free(): invalid next size (normal): 0x0000000001d25770 ***

when I try to start the power meter.

Any idea what it could be?

Thanks,

Vasileios.
