Title: “Intel® Xeon Phi™ coprocessor Power Management Part 2b: Package C-States, The Details”

 TERMINOLOGY NOTE:

Upon reading the SDG (Intel® Xeon Phi™ Coprocessor Software Developer’s Guide), you’ll find a variety of confusing names and acronyms. Here’s my decoder ring:

Package Auto C3: also referred to as Auto-C3, AutoC3, PC3, C3, Auto-PC3 and Package C3

Package Deep-C3: also referred to as PC3, DeepC3, DeeperC3, Deep PC3 and Package C3 (No, I am not repeating myself.)

Package C6: also referred to as PC6 and C6.
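If you like your decoder rings executable, here is a minimal sketch of the list above as a lookup table. The names `ALIASES` and `decode` are my own inventions for illustration, not anything from the SDG. Notice that some names are ambiguous, so the lookup returns every state a name might mean.

```python
# Alias lists transcribed from the decoder ring above.
# ALIASES and decode() are hypothetical names used only for this sketch.
ALIASES = {
    "Package Auto-C3": ["Auto-C3", "AutoC3", "PC3", "C3", "Auto-PC3", "Package C3"],
    "Package Deep-C3": ["PC3", "DeepC3", "DeeperC3", "Deep PC3", "Package C3"],
    "Package C6": ["PC6", "C6"],
}


def decode(name):
    """Return every package C-state that a given name might refer to."""
    key = name.strip().lower()
    return sorted(
        state
        for state, names in ALIASES.items()
        if key == state.lower() or key in (n.lower() for n in names)
    )


print(decode("PC3"))  # ['Package Auto-C3', 'Package Deep-C3']
print(decode("C6"))   # ['Package C6']
```

When a name decodes to two states, as "PC3" does, you have to fall back on the surrounding context in the SDG to work out which one is meant.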

BACKGROUND: WHAT THE HECK IS THE “UNCORE”?

Before we dig deep into package C-states, I want to give you some background about the circuitry on a modern Intel processor. A natural way of dividing up that circuitry is into the cores themselves (basically everything supporting the pipeline, ALUs, registers, cache, etc.) and everything else (the supporting circuitry). It turns out that “everything else” can be further divided into support circuitry that is not directly related to performance (e.g. PCI Express* interfacing) and support circuitry that is (e.g. the bus connecting the cores). Intel calls the support circuitry that directly impacts the performance of an optimized application the “Uncore”.

 

 

Figure 0 Circuitry types on the coprocessor

Now that that is out of the way, let us get back to package C-states.

WHY DO WE NEED PACKAGE C-STATES?

After gating the clocks of every one of the cores, what other techniques can you use to get even more power savings? Here’s a trivial and admittedly flippant example of what you could do: unplug the processor. You’d be using no power, though the disadvantages of pulling the power plug are pretty obvious. A better idea is to selectively shut down the more global components of the processor in such a way that you can bring the processor back up to a fully functional state (i.e. C0) relatively quickly.

Package C-states are just that: the progressive shutdown of additional circuitry to get even more savings. Since we have already shut down all of the package’s circuitry associated with the cores, the remaining circuitry is necessarily common to all the cores, thus the name “package” C-states.

WHAT PACKAGE IDLE STATES ARE THERE?

My dear readers, there are three package C-states: Auto-C3, Deep-C3, and (package) C6. As a reminder, all of these are package C-states, meaning that all the threads/CPUs in all the cores are in a HALT state. I know what you are thinking. “If all the cores in the coprocessor are in a HALT state, how can the Power Management (PM) software (SW) run?” That’s a good question. The answer is obvious once you think on it. If the PM SW can’t run on the coprocessor, where can it run? On the host, of course.

 

Figure 1 Coprocessor and host power management responsibilities and control

 

There are two parts to controlling power management on the Intel® Xeon Phi™ coprocessor: the PM SW that runs on the coprocessor, and the PM component of the MPSS Coprocessor Driver that runs on the host. See Figure 1. The coprocessor part controls transitions into and out of the various core C-states. Naturally, when it is not possible for the PM SW to run on the coprocessor, such as for package Deep-C3 and package C6, the host takes over. Control of package Auto-C3 is shared by both.
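To keep the division of labor straight, here is a small sketch of it as a lookup table. The `Controller` enum, the `CONTROL` table and the `host_only` helper are hypothetical names of my own; only the assignments themselves come from the paragraph above.

```python
from enum import Enum


class Controller(Enum):
    """Hypothetical labels for the two PM components described above."""
    COPROCESSOR = "PM SW running on the coprocessor"
    HOST = "PM component of the MPSS Coprocessor Driver on the host"


# Who controls which kind of idle state, as stated in the text.
CONTROL = {
    "core C-states": {Controller.COPROCESSOR},
    "package Auto-C3": {Controller.COPROCESSOR, Controller.HOST},
    "package Deep-C3": {Controller.HOST},
    "package C6": {Controller.HOST},
}


def host_only(state):
    """True when only the host can manage the given state."""
    return CONTROL[state] == {Controller.HOST}


for state in CONTROL:
    print(state, "-> host only" if host_only(state) else "-> coprocessor involved")
```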

WHAT IS SHUT DOWN IN THE PACKAGE C-STATES?

I was going to rewrite this table but it is so clear, I am stealing it instead. It is Table 3-2 of the Intel® Xeon Phi™ Coprocessor Software Developer’s Guide (SDG).

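Package Idle State | Core State | Uncore State | TSC/LAPIC | C3WakeupTimer                    | PCI Express* Traffic
-------------------|------------|--------------|-----------|----------------------------------|---------------------
PC3                | Preserved  | Preserved    | Frozen    | On expiration, package exits PC3 | Package exits PC3
Deep C3            | Preserved  | Preserved    | Frozen    | No effect                        | Times out
PC6                | Lost       | Lost         | Reset     | No effect                        | Times out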

 

And for those of you who want a little more detail:

Package Auto-C3: Ring and Uncore clock gated

Package Deep-C3: VccP reduced

Package C6: VccP is off (i.e. Cores, Ring and Uncore are powered down)

TSC and LAPIC are clocks that stop when the Uncore is shut down. They have to be set appropriately when the package is reactivated. “PC3” is the same as the package Auto-C3 state.
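For those who would rather query the table than read it, here is a minimal sketch that encodes Table 3-2 as data. The cell values are copied from the table; the `PackageState` class and the two helper functions are hypothetical names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageState:
    """One row of Table 3-2; field names are my own shorthand."""
    core: str
    uncore: str
    tsc_lapic: str
    c3_wakeup_timer: str
    pcie_traffic: str


TABLE = {
    "PC3": PackageState("Preserved", "Preserved", "Frozen",
                        "On expiration, package exits PC3", "Package exits PC3"),
    "Deep C3": PackageState("Preserved", "Preserved", "Frozen",
                            "No effect", "Times out"),
    "PC6": PackageState("Lost", "Lost", "Reset",
                        "No effect", "Times out"),
}


def wakes_on_pcie_traffic(state):
    """True when incoming PCI Express traffic brings the package out of the state."""
    return TABLE[state].pcie_traffic.startswith("Package exits")


def loses_state(state):
    """True when core or Uncore state is lost on entry to the state."""
    row = TABLE[state]
    return "Lost" in (row.core, row.uncore)


for name in TABLE:
    print(name, wakes_on_pcie_traffic(name), loses_state(name))
```

Read this way, the table says that only PC3 (package Auto-C3) exits by itself on PCI Express traffic or timer expiration, and only PC6 loses core and Uncore state.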

HOW ARE IDLE PACKAGE C-STATE TRANSITIONS DETERMINED?

Into Package Auto-C3: You can think of the first package state, Auto-C3, as a transition state. The coprocessor PM SW can initiate a transition into this state. The MPSS PM SW can override this request under certain conditions, such as when the host knows that the Uncore part of the coprocessor is still busy.

We will also see that the package Auto-C3 state is the only package state that can be initiated by the coprocessor’s power management. Though this seems a little unfair at first, upon further thinking the reason is obvious. At the start of a transition into package Auto-C3, the coprocessor PM SW routine is running and can initiate the transition into the first package state. (To be technically accurate, the core executing the PM SW can transition quickly out of a core C-state into C0.)

Beneath Auto-C3, the coprocessor isn’t executing, and transitions to deeper package C-states are best controlled by the host PM SW. This is not only because the coprocessor’s own PM SW is essentially suspended, but also because the host can see what is happening in a more global sense, such as Uncore activity after all the cores are gated, and traffic across the PCI Express bus.
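Here is a toy sketch of the handshake just described: the coprocessor PM SW asks for package Auto-C3, and the host side may veto the request. The function name, its parameters and the returned labels are all hypothetical. The real MPSS logic is more involved and is not shown in this blog.

```python
def request_auto_c3(all_cores_halted, host_sees_uncore_busy):
    """Toy model of a coprocessor-initiated request for package Auto-C3.

    all_cores_halted:      every thread/CPU in every core is in a HALT state
    host_sees_uncore_busy: the host knows the Uncore is still active
    Both inputs are assumptions of this sketch, not real driver fields.
    """
    if not all_cores_halted:
        # Package C-states only make sense once all the cores are idle.
        return "no package C-state"
    if host_sees_uncore_busy:
        # The host PM SW overrides the coprocessor's request.
        return "request overridden by host"
    return "package Auto-C3"


print(request_auto_c3(True, False))   # package Auto-C3
print(request_auto_c3(True, True))    # request overridden by host
print(request_auto_c3(False, False))  # no package C-state
```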

Into Package Deep-C3: The host’s coprocessor PM SW looks at idle residency history, interrupts (such as PCI Express traffic), and the cost of waking the coprocessor up from package Deep-C3 to decide whether to transition the coprocessor from a package Auto-C3 state into a package Deep-C3 state.

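Into Package C6: Same as the package Deep-C3 transition, only more so. The host weighs the same factors, but because core and Uncore state are lost in package C6, waking the coprocessor back up costs more.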
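To make the host’s decision a little more concrete, here is a minimal sketch of a policy that weighs idle residency history, interrupts and wake-up cost. Everything specific in it is an assumption on my part: the function name, the break-even factor and the wake-up costs are made-up illustrative values, not numbers from the SDG or the MPSS driver.

```python
from statistics import mean

# Hypothetical tuning values, chosen only to make the example run.
BREAK_EVEN = 10                                  # idle time must exceed 10x the wake cost
WAKE_COST_US = {"Deep-C3": 100, "C6": 1000}      # made-up wake-up costs in microseconds


def choose_package_state(idle_history_us, recent_interrupts):
    """Pick the deepest package state that the expected idle time justifies.

    idle_history_us:   recent idle residency samples, in microseconds
    recent_interrupts: interrupts (e.g. PCI Express traffic) seen recently
    """
    if recent_interrupts > 0 or not idle_history_us:
        # Activity or no history: stay in the shallow state.
        return "Auto-C3"
    expected_idle = mean(idle_history_us)
    choice = "Auto-C3"
    for state in ("Deep-C3", "C6"):              # ordered from shallower to deeper
        if expected_idle < BREAK_EVEN * WAKE_COST_US[state]:
            break                                # waking up would cost too much
        choice = state
    return choice


print(choose_package_state([20000, 30000], 0))  # C6
print(choose_package_state([2000, 4000], 0))    # Deep-C3
print(choose_package_state([20000, 30000], 3))  # Auto-C3
```

The point of the sketch is the shape of the trade-off, not the numbers: the deeper the state, the longer the coprocessor has to stay idle before the savings pay for the wake-up.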

REFERENCES

For those of you with a passion for power management, check out the Intel® Xeon Phi™ Coprocessor Software Developer’s Guide. It has state diagrams and other goodies. I recommend sections 2.1.13, “Power Management”, and all of section 3.1, “Power Management (PM)” for your late night reading.

NEXT: AN INTUITIVE DESCRIPTION OF POWER STATES USING STICK FIGURES AND LIGHTBULBS

REFERENCES

Kidd, Taylor, "Intel® Xeon Phi™ coprocessor Power Management Pt 0: Introduction and inquiring minds," Intel® Corporation, March 24th, 2013. http://software.intel.com/en-us/blogs/2013/03/24/intel-xeon-phi-coprocessor-power-management-pt-0-introduction-and-inquiring-minds

Kidd, Taylor, "Intel® Xeon Phi™ coprocessor Power Management Part 1: P-States, Reducing power consumption without impacting performance," Intel® Corporation, May 15th, 2013. http://software.intel.com/en-us/blogs/2013/05/15/intel-xeon-phi-coprocessor-power-management-part-1-p-states-reducing-power

Kidd, Taylor, "Intel® Xeon Phi™ coprocessor Power Management Part 2a: Core C-States, The Details," Intel® Corporation, June 3rd, 2013. http://software.intel.com/en-us/blogs/2013/06/03/intel-xeon-phi-coprocessor-power-management-part-2a-core-c-states-the-details
