(update) C-states, C-states and even more C-states

As I said before, a C-state is an idle state. The processor isn't doing anything useful, so why not shut some things off? Think of it in terms of your house. If you're not at home, why keep the lights, the radio, and those six televisions going? Modern processors have several different C-states representing increasing amounts of "stuff" shut down. C0 is the operational state, meaning that the CPU is doing useful work. C1 is the first idle state: the clock running to the processor is gated, i.e. the clock is prevented from reaching the core, effectively shutting the core down in an operational sense. C2 is the second idle state: the external I/O Controller Hub blocks interrupts to the processor. And so on with C3, C4, etc. I'll discuss this further down in this paper. By the way, there is nothing preventing the OS from busy-waiting when it has nothing to run, and thus keeping the processor in C0, as older operating systems did. From the OS's standpoint the processor is idling, but in hardware terms it is chewing up energy for no useful reason other than being an ineffectual heater.

So what's this thing about "C-states, C-states and even more C-states"? It turns out that there are different kinds of C-states depending upon what part of your system you are talking about. There are core C-states, processor C-states, and OS C-states. All are similar in that they are idle states (I'm excluding C0, of course). They also differ in some substantial ways.

A core C-state is a hardware C-state. There are several core idle states, e.g. CC1 and CC3. As we know, a modern state-of-the-art processor has multiple cores, such as the recently released Core Duo T5000/T7000 mobile processors, known as Penryn in some circles. What we used to think of as a CPU / processor actually has multiple general-purpose CPUs inside of it. The Intel Core Duo has 2 cores in the processor chip. The Intel Core 2 Quad has 4 such cores per processor chip. Each of these cores has its own idle state. This makes sense, as one core might be idle while another is hard at work on a thread. So a core C-state is the idle state of one of those cores.

A processor C-state is related to a core C-state. At some point, cores share resources, e.g. the L2 cache or the clock generators. Suppose one idle core, say core 0, is ready to enter CC3 but the other, say core 1, is still in C0. We don't want core 0's descent into CC3 to stop core 1 from executing because we just happened to shut down the clock generators. Thus we have the processor / package C-state, or PC-state. The processor can only enter a PC-state, say PC3, if both cores are ready to enter the corresponding CC-state, e.g. both cores are ready to step into CC3. I'll talk more about this in a subsequent section.
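To make the rule concrete, here is a minimal sketch. It assumes C-states can be modeled as plain integers (0 for C0, 3 for CC3, and so on); the function name `package_c_state` is made up for illustration and is not any real API.

```python
def package_c_state(core_states):
    """Return the deepest package C-state the cores jointly allow.

    core_states: one integer per core, giving the state that core is
    ready to enter (0 means the core is still running in C0).
    The package can go no deeper than its shallowest core.
    """
    if not core_states:
        raise ValueError("need at least one core")
    return min(core_states)


# Core 0 is ready for CC3, but core 1 is still executing in C0.
print(package_c_state([3, 0]))  # 0: the package stays in C0

# Both cores are ready for CC3, so the package may enter PC3.
print(package_c_state([3, 3]))  # 3
```

The same function covers a four-core part: the package is held back by whichever core is in the shallowest state.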

A logical C-state: The last C-state is the OS's view of the processors' C-states. In Windows, a processor's C-state is pretty much equivalent to a core C-state. In fact, the OS's lower level power management software determines when and if a given core enters a given CC-state using the MWAIT instruction. There is one important difference. When an application, such as Intel's PowerInformer, thinks it's interrogating a processor core CC-state, what is returned is the C-state of what is called a "logical core". (A logical core is technically not the same as a physical core. In my experience, a logical core is almost always the same as a physical core, but it doesn't have to be.) Logical cores don't have to worry about little things such as the hardware the OS is running on. For example, the C-state of a logical core doesn't worry about the barriers imposed by shared resources, such as the clock generators, I talked about earlier. Logical Core 0 can be in C3 while Logical Core 1 is in C0.

This seems a little confusing, doesn't it? So how do logical core C-states, core C-states and processor C-states relate to each other? Take the situation above: from the OS perspective, logical core 0 is in C3 and logical core 1 is in C0. Since C3, from the hardware perspective, actually shuts down a shared resource, the clock generators, (physical) core 0 must be held at CC2, because core 1 is in C0 and still using the clock generators. The processor, in a global sense, is not idle since core 1 is in C0, so the processor's C-state is C0. To use a little bit of that intimidating mathematics,

Processor C-state = Min(core C-states)


Core C-state = Minimum barrier(set of all logical C-states)


Logical C-state = anything the OS wants
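The three relations can be played out on the two-core example above. What follows is only a toy model under stated assumptions, not how real hardware arbitrates: C-states are plain integers, C3 is taken to be the only state that powers down a shared resource (the clock generators), and the names `SHARED_RESOURCE_STATES`, `resolve_core_states` and `processor_state` are invented for illustration.

```python
# Assumption: only C3 shuts down a resource shared by all cores.
SHARED_RESOURCE_STATES = {3}


def resolve_core_states(logical_states, shared=SHARED_RESOURCE_STATES):
    """Map OS-requested (logical) C-states to states the hardware can grant."""
    shallowest = min(logical_states)
    resolved = []
    for requested in logical_states:
        granted = requested
        # Back off one level at a time while the state would shut down a
        # shared resource that a shallower core still needs.
        while granted > shallowest and granted in shared:
            granted -= 1
        resolved.append(granted)
    return resolved


def processor_state(core_states):
    """Processor C-state = Min(core C-states)."""
    return min(core_states)


logical = [3, 0]                      # the OS's view: anything it wants
cores = resolve_core_states(logical)  # core 0 is held at CC2
print(cores)                          # [2, 0]
print(processor_state(cores))         # 0
```

If both logical cores request C3, nothing holds either core back: the resolved core states are `[3, 3]` and the processor state is 3.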



Next: There has got to be a catch