Why P scales as C*V^2*f is so obvious (pt 2)

THE GORY DETAILS

Let’s continue from where we left off last time. Let’s figure out the why of the equation,

P = C * V^2 * (a * f)

To do this, we’re going to have to look at what is going on in one of the fundamental building blocks (a CMOS inverter) of an integrated circuit (IC).

So when and how does this circuit dissipate power?

Before getting into the math, let’s get our variables right.

Vdd is the voltage across the gate

Ipeak is the peak short circuit current going through the gate when it switches state (0 to 1 or 1 to 0)

Ileakage is the current through the gate even when it is reverse biased (i.e. in a 0 or a 1 state)

CL is the capacitance of one transistor

ts is the switching time needed to change the state of the switch

fg=1/Tg is the maximum rate that the gate can cycle at in our processor. In other words, it is the gate’s clock frequency.

Let’s start out with the basic CMOS gate, see above. There are three current paths, one going through the gate, one charging the capacitance of the gate, and one resulting from leakage through a reverse biased gate.

The one going through the gate results from the brief time that both transistors are closed (conducting), causing a short circuit. In an ideal world, the switch would be instantaneous and there would be no current flow, and hence no power loss. But this isn’t a perfect world. There is a brief period of time, the switching time ts, when we have a short circuit. The power is going to be the voltage across the short circuit, Vdd, multiplied by the current, Ipeak. (We’re using the peak current since this gives us an upper bound on the power dissipated.) Say the short circuit exists for ts. Then the total energy lost is bounded by

Energy loss due to short circuit <= Vdd * Ipeak * ts

Let’s now look at the energy lost to leakage through a reverse biased gate. Since ts is small compared to Tg, the gate sits in a steady state for nearly the whole cycle, so we can approximate the energy loss as,

Energy loss due to reverse biased gate ~ Vdd*Ileakage*Tg

What about the total energy loss? If the energy loss due to the reverse bias leakage and short circuit current are so small, where is all that energy coming from that is heating our processor?

To get this, we need to look more closely at what a gate looks like in an analog sense. A reverse biased transistor is basically a capacitor, that is, two plates separated by an insulator / dielectric. From the figure above, it’s CL. A forward biased transistor is a short. These plates charge and discharge like a capacitor because of the design of a gate. In one state, one transistor is “open” and the other is the acting capacitor. In the other state, the roles reverse and the other transistor is the acting capacitor. What I’m trying to say is that even though the circuit is essentially open, current still flows from one transistor / capacitor to the other. This current flow is going to cause resistive heating, and so consume power.

The equation for the energy stored in a capacitor (C) is

Energy in a capacitor = ½ * C * V^2

At each transition, the capacitor dumps the energy stored in it either to ground or to the complementary transistor, giving us the following.

Energy flow due to a state transition = ½ * CL * Vdd^2
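
Why half, and why doesn’t the quality of the transistor matter? Here is a quick sketch, assuming the conducting transistor behaves like a fixed resistor R while it charges CL from 0 to Vdd. (R is my simplification; it isn’t one of the variables defined above.)

```latex
\begin{align*}
i(t) &= \frac{V_{dd}}{R}\, e^{-t/(R C_L)} \\
E_{\text{supply}} &= \int_0^\infty V_{dd}\, i(t)\, dt = C_L V_{dd}^2 \\
E_{R} &= \int_0^\infty i(t)^2 R\, dt = \tfrac{1}{2} C_L V_{dd}^2 \\
E_{C_L} &= E_{\text{supply}} - E_{R} = \tfrac{1}{2} C_L V_{dd}^2
\end{align*}
```

Under that assumption R drops out completely. The charging transition turns ½ * CL * Vdd^2 into heat and parks the same amount on the capacitor, and the discharging transition then burns the parked half.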

Remember that one cycle has two state transitions. So the complete equation for the energy loss caused by one cycle, which we’ll call Etr, is,

Etr = CL*Vdd^2 + 2*Vdd*Ipeak*ts + Vdd*Ileakage*Tg
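
To get a feel for how the three terms of Etr compare, here is a minimal Python sketch. The function name and every input value are made up by me for illustration; they are not measurements of any real process, and real numbers could shift the balance a lot.

```python
def energy_per_cycle(c_load, v_dd, i_peak, t_switch, i_leak, t_cycle):
    """Return the three terms of Etr, in joules, for one full cycle of one gate."""
    switching = c_load * v_dd ** 2                # two transitions, each 1/2 * CL * Vdd^2
    short_circuit = 2 * v_dd * i_peak * t_switch  # upper bound, one burst per transition
    leakage = v_dd * i_leak * t_cycle             # steady trickle over the whole cycle
    return {"switching": switching, "short_circuit": short_circuit, "leakage": leakage}


# Made-up inputs for illustration only (not data from any real transistor).
terms = energy_per_cycle(
    c_load=1e-15,     # CL: 1 fF
    v_dd=1.0,         # Vdd: 1 V
    i_peak=10e-6,     # Ipeak: 10 uA
    t_switch=10e-12,  # ts: 10 ps
    i_leak=1e-9,      # Ileakage: 1 nA
    t_cycle=1e-9,     # Tg: 1 ns
)
e_tr = sum(terms.values())

for name, joules in terms.items():
    print(f"{name:>13}: {joules:.2e} J ({joules / e_tr:.1%} of Etr)")
print(f"{'Etr':>13}: {e_tr:.2e} J")
```

Try plugging in a larger i_leak or a longer t_cycle to see how quickly the leakage term stops being negligible.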

I’m now going to do something that is blatantly wrong. It’s also a huge topic on its own that I’m completely unqualified to talk about. I’m going to ignore the last two terms. I’m pretty sure that they were second-order effects maybe 10 years ago. And the first of the two, the one with Ipeak, may still be. But that last term, oh that last term. Volumes have been written about it, generally in a language incomprehensible to us mere mortals. (Request: can anyone out there tell us more?)

After dropping those last two terms, we’re left with,

Etr ~ CL * Vdd^2

This is almost what we want. We’re missing that annoying little “f”. The little equation we wrote above is for one cycle of a gate. True, modern processors can do one heck of a lot in one cycle, but a one cycle application is still pretty uninteresting. Our gates above are switching all the time at a rate related to the frequency of the processor, which we’ll call “a * f”, where f is the frequency of the processor and a is some constant.

Energy output of a gate/sec ~ CL * Vdd^2 * (a*f)
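
The useful thing about this form is how it scales: linearly in f, but with the square of Vdd. A minimal Python sketch of that, where the function name and the baseline values are my own made-up assumptions and only the ratios matter:

```python
def gate_power(c_load, v_dd, activity, freq):
    """Dynamic power of one gate in watts: CL * Vdd^2 * (a * f)."""
    return c_load * v_dd ** 2 * (activity * freq)


# Made-up baseline, for illustration only.
base = gate_power(c_load=1e-15, v_dd=1.0, activity=0.1, freq=3e9)

# Change one knob at a time and compare against the baseline.
half_voltage = gate_power(c_load=1e-15, v_dd=0.5, activity=0.1, freq=3e9)
half_clock = gate_power(c_load=1e-15, v_dd=1.0, activity=0.1, freq=1.5e9)

print(f"half the voltage: {half_voltage / base:.2f}x the power")
print(f"half the clock:   {half_clock / base:.2f}x the power")
```

Halving the clock halves the power, while halving the voltage cuts it to a quarter.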

And how many transistors are in a high-end Intel processor today? Close to a billion for 45 nm. (And the next generation is 32 nm.) So we’ve 1.0E9 (1 billion) transistors per processor, running at frequencies of 3E9 Hz (3 billion). Let’s see, 1E9*3E9 is – scientific notation always confuses me – up to 3E18 transitions per second, if every transistor switched on every clock. Is there even a name for 1E18?

Energy output of a processor/sec ~ CL * Vdd^2 * (a*f) * <number of transistors>
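
For those of us who get confused by scientific notation, here is the arithmetic as a minimal Python sketch. The transistor count and clock come from the text; CL, Vdd and a are values I made up purely for illustration, so the wattage is a ballpark for these inputs and nothing more.

```python
def processor_power(c_load, v_dd, activity, freq, n_transistors):
    """Rough dynamic power in watts: CL * Vdd^2 * (a * f) * <number of transistors>."""
    return c_load * v_dd ** 2 * (activity * freq) * n_transistors


n_transistors = 1e9  # from the text: about a billion
freq = 3e9           # from the text: 3 GHz

# Upper bound on transitions per second, as if every transistor switched every clock.
max_transitions = n_transistors * freq

# CL, Vdd and a are made-up values, for illustration only.
watts = processor_power(c_load=1e-15, v_dd=1.0, activity=0.01,
                        freq=freq, n_transistors=n_transistors)

print(f"up to {max_transitions:.0e} transitions per second")
print(f"about {watts:.0f} W with these made-up inputs")
```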

Now before those of you who actually know this stuff start crabbing, let me make it clear that I am only attempting to help people understand where the equation comes from. Yes, the effect of the short circuit current contributes noticeably to the processor’s heating when we’re talking about, say, a billion transistors. And there’s that aforementioned leakage current. And there are the leakage and voltage issues related to smaller and smaller junctions. And there are a lot of circuit elements that aren’t strictly logic gates that contribute to the power. And there have been a lot of developments related to reducing the short circuit and leakage currents of a logic gate. And yes, I’m an ignorant software hack.

But for those of you who are neither processor architects nor researchers into modern IC materials, mayhap this gives you a little better understanding of where this (I hope) formerly mysterious relationship comes from.

Of course, I could be just blowing smoke, but then, one of you less than gentle readers out there will let me know.

 

5 comments

anonymous's picture

If you are interested in CPU power consumption, you may be interested in this too: http://www.awardfabrik.de/forum/showthread.php?t=6894
I am sorry that it is an outdated processor and the variable efficiency of the Voltage-Regulator-Module is included in the measurement.

anonymous's picture

Hi,

thank you for that article Taylor Kidd! I really didn't expect such a technical article in a software blog!
I am interested in processor manufacturing and I read a lot about it, but as you may know, it's not always easy to get the right information.

Actually, it is very hard to get detailed and relevant information about these subjects first hand, so I think it's best to compare processors on the market by power consumption at idle and under load, and at different voltages, to get a clue of how stuff COULD work.

I recently got test results for some 65nm C2Q6600 processors, and there were clearly visible differences among them. If we take the specific Voltage-ID (VID) of each processor as a "performance" index, these observations can be made:
- CPUs with a low VID have a higher power consumption at idle than CPUs with a higher VID (49% higher power consumption for a 7.5% lower VID in my case). => A lower VID has to do somehow with higher leakage (and/or energy loss due to short circuit and/or any other energy losses not influenced by load state) - [is the VID determined by power consumption?] => the power dissipation caused by leakage (...) is well measurable!
- At load, low and high VID CPUs have nearly the same power consumption (at stock speed). => Most of the power consumption at load is caused by switching gates on/off, so leakage accounts for a low percentage. (average 3.6% higher power consumption)
- Overclocking and overvolting the CPUs with a high and a low VID obviously increased the differences.

If we continue shrinking the structure size of microprocessors without new ideas and innovations like High-K, the influence of power dissipation by leakage will grow for sure. (But I've got confidence in Intel :-) ... )

Of course all I am saying is not confirmed, I'm only a student guessing about the most complex manufactured products on earth.

Taylor IoT Kidd's picture

Well, we shouldn't really drop the last two terms when considering modern processors. At one time, they weren't that significant. At the scales of today’s processors, the reverse bias leakage current becomes a very significant factor. I can’t really speak to what’s been happening to the short circuit current. From what I read, leakage is the reason for the high-k hafnium dielectrics. (I’ve talked to individuals who actually know about these things. They usually tell me something like, “Yah, sure, I can give you the equations, but then I’d have to kill you.”) I’m going to scour the open literature so I can write something a little more informative about this topic. For a more informative than usual marketing description, you can look at http://www.intel.com/pressroom/archive/releases/20070128comp.htm.

RE: dropping leakage and short circuit: Just looking at the equations, what you say is correct. But there are other constraints we didn’t talk about that come into play at some point. For example, what is the new switching time for these new transistors? How about the effect of the new metal gate? Are there new significant non-linear effects with these new materials and at the new scales?

RE: one quintillion: From Wikipedia, “The highest numerical value banknote ever printed was a note for 1 sextillion pengő (1E21 or 1 milliard bilpengő as printed) printed in Hungary in 1946.” And I thought that there was no practical use for naming such large numbers.

terry100's picture

So, basically we drop the last two terms because they are insignificant in magnitude, right? Am I missing something? It seems that if someone wanted to lower the energy output of a processor, he would have to lower the capacitance, the gate voltage or the clock. Which one of those is currently at its lower limit, if any?

Btw, 1E18 is a quintillion. (http://en.wikipedia.org/wiki/Names_of_large_numbers). Although I suspect you already know...

anonymous's picture

First off thanks for the article.

So, we are dropping the last two terms because they are insignificant in magnitude, right? Am I missing something? In order to lower the energy output of a gate, someone has to lower the capacitance, the gate voltage or the clock? Which one(s) of them is currently at its limit?

Btw, 1E18 is a quintillion. (http://en.wikipedia.org/wiki/Names_of_large_numbers)
