To: Sam Citron who wrote (10786)11/7/2003 3:45:26 PM
From: Sam Citron  Read Replies (1) of 10921
 
A Conversation with Dan Dobberpuhl - Part 1

From Power
Vol. 1, No. 7 - October 2003
Have we maxed out yet on microprocessor power? Two industry veterans discuss the trade-offs.

Introductions
The computer industry has always been about power. The development of the microprocessors that power computers has been a relentless search for more power, higher speed, and better performance, usually in smaller and smaller packages. But when is enough enough?

Two veteran microprocessor designers discuss chip power and the direction microprocessor design is going. Dan Dobberpuhl is responsible for the design of many high-performance microprocessors, including the PDP-11, MicroVAX, Alpha, and StrongARM. He worked at Digital Equipment Corporation as one of five senior corporate consulting engineers, Digital's highest technical positions, directing the company's Palo Alto Design Center. After leaving Digital, Dobberpuhl founded SiByte Inc., later acquired by Broadcom. In an October 1998 article, EE Times named him one of "40 forces that will shape the semiconductor industry of tomorrow." He has written numerous technical papers and is coauthor of the text The Design and Analysis of VLSI Circuits, well known to a generation of electrical engineering students. Dobberpuhl is the named inventor on nine issued U.S. patents and has several more pending patent applications in various areas of circuit design.

David Ditzel directs our conversation with Dobberpuhl. Ditzel is vice chairman and chief technology officer of Transmeta Corporation, which he cofounded in 1995 to develop a new kind of computer—one that would learn how to improve its performance and save power as it ran, by using software embedded in the processor itself. Before founding Transmeta, Ditzel was director of SPARC Labs and chief technical officer at Sun Microsystems' microelectronics division. Ditzel came to Sun from AT&T Bell Laboratories in 1987, where he was the chief architect of the CRISP Microprocessor, AT&T's first RISC chip. His work first attracted industry-wide attention in 1980, when he coauthored "The Case for the Reduced Instruction Set Computer (RISC)."

DAVE DITZEL Dan, do you want to introduce yourself and say a couple of words about your professional background?

DAN DOBBERPUHL I recently founded a new fabless chip company called P.A. Semi. I have been in the industry for 36 years, and have been developing microprocessors since 1976. I have seen a lot of changes during that period in terms of both silicon technology and microprocessor development.

DITZEL You've got a wonderfully long history here. And our topic today is really about low power. Do you remember the very first computer you worked on, and can you take a guess at what kind of power that computer probably took?

DOBBERPUHL Well, it's interesting Dave. The power dissipation of the early MOS chips wasn't all that high, because the frequencies were so low and the chips were small and the transistor counts were very low. But they were typically in the range of three to five watts.

DITZEL In those days you probably used a minicomputer or mainframe to run your CAD tools, and there was a big difference between the performance and size of those development machines and the microprocessor chips you were developing.

DOBBERPUHL Sure. So when we developed the LSI-11, our processor design environment was basically a PDP-10, which was the time-sharing machine of the era. That was a large multi-rack system. And at the time the performance ratio between those machines and the chips we were developing was very high, by factors of 10 to 100. Over the next 10 years we brought that to equivalence, and then basically the CMOS devices took over.

LSI-11
DITZEL So let's set the stage. The LSI-11 started about what year?

DOBBERPUHL The original LSI-11 started in the early 70s. It was designed jointly by engineers at Digital and Western Digital.

DITZEL And what kind of MIPS rating did that have? How many million instructions per second do you think it ran?

DOBBERPUHL I believe the clock rate on the original LSI-11 was about 1.3 megahertz. That was the microinstruction rate. I would guess that the machine was probably about 0.2 MIPS. The next generation, the LSI-11/23, bumped the clock rate all the way up to 3.3 megahertz.

DITZEL Another chip project that you're fairly famous for is the DEC Alpha chip. That seemed to take a somewhat different approach in terms of trying to drive up the performance, but you drove up the power at the same time. How much did the power go up from that of the LSI-11? How hot did the Alpha chips get?

DOBBERPUHL When we started on the first Alpha chip design, which was in about 1987 or 1988, we did an extrapolation of where we thought the industry leaders would be in terms of clock rate and performance over the two-and-a-half to three years it would take to do the development.

At the time I think that people were in the 30- to 33-megahertz clock range in 1987-88. We projected that they would not be any higher than 100 megahertz by 1991, so we set our goal as 200 megahertz in 1991. Everything else in that development was subservient to the 200 megahertz.

That was the goal. Whatever it took. If we could get the current in and the heat out, that was all that mattered. It turned out that the power dissipation of the 200-megahertz Alpha, which was in 0.75-micron technology, was about 30 watts, which was at the time fairly incredible for a CMOS chip.

DITZEL In particular, it's not just the total power, but I think because the voltage is low, the current is fairly high.

DOBBERPUHL It was very high, about 10 amps average current. I remember that the clock driver width totaled about 20 inches to get the sub-nanosecond rise times that we needed. And current pulled during switching was about 50 amps. So it was fairly spectacular at the time, because people really thought about microamps and milliamps in CMOS, and we were talking tens of amps.

DITZEL There were very few chips that took that much power. I remember one engineer saying, "Oh my goodness, the circuit breaker coming into my house barely handles that amount of current. And you're trying to pull that into one small silicon chip." How could you deal with those problems? Weren't there big issues in just keeping the chip cool at that point and getting the power in on the wires?

DOBBERPUHL Sure, I think actually the cooling wasn't as difficult a problem as getting the current in, because we were in wire-bond package technology and the pin counts weren't all that high. So there were two big issues. One of them was just the total current, the average 10 amps. The big problem was the peak current, which was in excess of 50 amps.

The fact that it switched in less than a nanosecond meant that we had to do something different. The big issue was basically the power supply stability—how to get around the V=L*di/dt problems. That's where we came up with the idea of using very large on-chip decoupling capacitors in order to smooth out the current waveforms going into the chip.
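
To get a feel for the V=L*di/dt problem Dobberpuhl is describing, here is a rough back-of-the-envelope sketch in Python. It uses the figures quoted above (roughly 50 amps of switching current in under a nanosecond); the package inductance and the tolerable ripple are assumed round numbers, not values from the interview.

# Rough numbers from the discussion above: ~50 A of switching current,
# sub-nanosecond edges.  The inductance and ripple budget below are
# assumed illustrative values, not figures from the interview.
di = 50.0      # amps of switching current
dt = 1e-9      # seconds (sub-nanosecond edge)
L_pkg = 1e-9   # henries of assumed package/bond-wire inductance

droop = L_pkg * di / dt              # V = L * di/dt
print(f"supply droop with no on-chip decoupling: {droop:.0f} V")   # ~50 V

# On-chip decoupling capacitance supplies that charge locally instead.
# To hold the ripple to an assumed 0.1 V while delivering the same charge:
dv_allowed = 0.1
C_decap = di * dt / dv_allowed       # from Q = C*dV = I*dt
print(f"decoupling capacitance needed: {C_decap*1e9:.0f} nF")       # ~500 nF

Even with cruder numbers the conclusion is the same: the charge has to come from capacitance sitting right next to the switching transistors, not through the package.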

DITZEL Did such large amounts of current cause surprises and things you hadn't expected to be problems, which you think you were one of the first to hit?

DOBBERPUHL Fortunately, we were able to identify most of them in advance. I can't recall that we were surprised by anything subsequent to getting the chips back.

But we had to do a lot of things like the decoupling capacitance, which was integrated into the clock driver, which you could see easily with your naked eye on those chips. It went down the center of the chip. Massive transistors plus decoupling capacitors were distributed across the whole center of the chip.

So I think the decoupling capacitance and coming up with rules to manage on-chip inductance, also an issue at the 200-megahertz clock rate, were kind of a first.

Capacitors
DITZEL How did you build those capacitors? I actually think that's going to turn into an interesting issue here. Did you build that out of transistor material?

DOBBERPUHL Those were just ordinary NMOS devices.

DITZEL But you have a very giant transistor, so to speak, that had a lot of capacitance.

DOBBERPUHL Exactly. And that was a bit of an issue for the process guys as far as the amount of total gate capacitance or gate area that was on the chip, in terms of the yield.

But it was also true that we knew at the time that most of the defects in the gates were along the edges; so these things were made relatively square, and you got a good ratio of area to perimeter.

DITZEL This is one of the cases where the textbook example of how you solved the problem may be changing with time. As we go into deeper sub-micron technologies, we're finding that gate leakage, which was almost insignificant when you did those early processors, could now turn out to be a larger problem: building a capacitor that way might generate too much leakage.

DOBBERPUHL It's a big problem. Leakage, as I'm sure we'll get into, is a big issue today. It was much less of an issue 10 years ago. These large decoupling capacitors, if you just make them out of modern thin gate oxide, are way too leaky.

DITZEL So in the future people are going to have to find new ways of building capacitors.

DOBBERPUHL There are a lot of different things you can do, including having a thicker layer of dielectric in the capacitor. Basically the transistor is suffering from the same problems. The capacitor is an easier problem because you have a lot of area available, so you don't need the capacitance per unit area in a capacitor that you do in a transistor.

StrongArm
DITZEL You went from building a chip that had one of the highest amounts of power in the industry in the Alpha to next working on the StrongARM chip [http://www.arm.com/armtech/ARM_Arch?OpenDocument], which was a very, very low-power chip. That's going from, "I don't care about power, I'll use whatever it takes to get megahertz," to doing almost the opposite. How did you undergo this religious conversion?

DOBBERPUHL Well, it was interesting—not the least of which was the amount of, if you'll excuse the pun, heat we took for the power in the Alpha chip. We wanted to show that not only could we design the highest-performance chip, but we could also design a high-performance low-power chip.

We presented Alpha in 1991 at ISSCC [International Solid-State Circuits Conference], and I think at ISSCC 1992 I was impressed with the work that Bob Brodersen and some of his colleagues and students were doing at Berkeley on low-power devices. Bob was preaching the religion of low voltage.

After ISSCC, I went back to Massachusetts and said to the process guys, "Let's run an experiment and see if we can just fabricate a relatively low-voltage, low-power Alpha."

So we reduced the VDD supply from 2.5V to 1.5V. As a result we got a pretty decent performance/power ratio on an Alpha chip. As I recall, the frequency went down to about 100 megahertz from 200, but the power was down to less than 5 watts from the original 30.
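
Those numbers are roughly what simple dynamic-power scaling predicts, since dynamic power goes as the square of the supply voltage times the frequency (the formula comes up again later in the conversation). A quick sanity check, assuming the switched capacitance stayed fixed:

# Dynamic power scales roughly as V^2 * f for a fixed switched capacitance.
# Using the figures quoted above: 2.5 V / 200 MHz / ~30 W  ->  1.5 V / 100 MHz.
p_original = 30.0                          # watts
scale = (1.5 / 2.5) ** 2 * (100 / 200)     # voltage squared times frequency
print(f"scale factor: {scale:.2f}")                      # 0.18
print(f"predicted power: {p_original * scale:.1f} W")    # ~5.4 W, close to the <5 W measured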

DITZEL A huge improvement.

DOBBERPUHL Huge improvement—of course, lower performance but much better performance per watt. That looked encouraging, and we said, "Well, what if we started from scratch and tried to design a really high-performance low-power Alpha chip?" I got excited about doing that.

In the meantime, another group of folks had left Digital, then returned, but were going to be based in Austin, Texas. Among them were Rich Witek and Jim Montanaro.

They also got very interested in low-power high-performance CMOS. But they said, "Instead of building a low-power Alpha, let's soup up an existing low-power chip like ARM." So for a while we had one group working on a low-power Alpha and another working on a high-performance ARM.

Well, the low-power Alpha turned out to be an interesting technical idea but not an interesting marketing and business idea, because the whole premise of Alpha was high performance. Doing anything of lesser performance didn't seem to make a whole lot of sense.

So that concept kind of died and we all concentrated on the idea of building a high-performance version of an existing low-power chip. We ended up, for various reasons, choosing the ARM architecture.

We set about building what became StrongARM, using, to a large degree, the same circuit techniques that we had used on Alpha, but with a power budget in mind instead of just frequency uber alles.

DITZEL Now you went from a 30-watt part in Alpha. Where did the first StrongARM chip come out when you were first able to measure it?

DOBBERPUHL It was 300 milliwatts at 160 megahertz.

DITZEL So you took off a factor of 100? That's pretty good in terms of technology scaling. One thing that may surprise people is that you didn't have to drastically change your circuit design techniques. What did you have to do differently to get that factor of 100?

DOBBERPUHL We did use pretty much the same design techniques. What changed were the constraints and parameters. For one thing we had to dramatically reduce the transistor count, which is a major factor in power dissipation. So it was a much simpler device—much, much simpler. Our estimate was that a factor of 3x was achieved simply due to the reduction in transistor count.

DITZEL For example, I don't think it had a floating point unit.

DOBBERPUHL That's correct. And the integer data path was only 32 bits versus 64; it was single-issue instead of superscalar, etc.

DITZEL People marvel sometimes at the small handheld devices. Their capabilities are still somewhat limited, but I assume those features we once had in the 50-watt chips will start to come back in the low-power chips over time.

DOBBERPUHL For certain, they will. And that's really the challenge we're facing today, in that the technology has advanced to the point where it's very easy to integrate lots of functionality. What's difficult is to manage the power concurrently with all that functionality and performance capability.

It used to be hard to design a 200-megahertz chip, just from the point of view of getting the transistors to go fast enough. Now the issue is they'll go plenty fast, but how do you keep it from burning up?

DITZEL Maybe for the readers here, you could talk just a bit about where power in the chip comes from. What makes the chip get hot, and what are the fundamental components in making the power go up or down?

DOBBERPUHL The power is dissipated mostly in the transistors, either as they switch or as they just sit there and leak.

You can calculate the dynamic power dissipation with the formula P = CV²f, where V is the power supply voltage, C is the capacitance that is being switched, and f is the switching rate. There are some additional factors, but fundamentally the dynamic power is given by that formula.
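
As a concrete illustration of how the formula gets used, the sketch below inverts P = C*V^2*f to estimate the effective switched capacitance behind the StrongARM figures quoted earlier (300 milliwatts at 160 megahertz). The supply voltage of roughly 1.5 volts is an assumption for the sake of the example, not a figure from the interview.

# Dynamic power: P = C * V**2 * f, with C the switched capacitance per cycle,
# V the supply voltage, and f the clock frequency.  Solving for C using the
# StrongARM numbers quoted earlier; the 1.5 V supply is an assumed value.
P = 0.300      # watts
V = 1.5        # volts (assumed)
f = 160e6      # hertz
C_eff = P / (V**2 * f)
print(f"effective switched capacitance: {C_eff*1e9:.2f} nF per cycle")   # ~0.83 nF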

DITZEL That capacitance is really related to the number of transistors, so you reduce the number of transistors and it reduces the power.

DOBBERPUHL Absolutely. The problem is that we want the functionality associated with large transistor counts and we want the performance associated with high frequencies. So we have to lower the voltage to manage the power. But as you lower the voltage, the transistor performance is reduced, and that can have a pretty significant impact. The way to compensate for that is to reduce the threshold voltage (Vt) of the transistors.
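
A commonly used first-order picture of this trade-off is the alpha-power delay model, in which gate delay goes roughly as V/(V - Vt)^alpha. The sketch below uses purely illustrative supply and threshold values (none of them from the interview) just to show the shape of the effect:

# Alpha-power gate-delay model: delay ~ V / (V - Vt)**alpha.
# All voltages below are illustrative values, not figures from the interview.
def relative_delay(v, vt, alpha=1.3):
    return v / (v - vt) ** alpha

base  = relative_delay(2.5, 0.5)    # nominal supply, nominal Vt
slow  = relative_delay(1.5, 0.5)    # lower the supply only: noticeably slower
fixed = relative_delay(1.5, 0.3)    # lower the supply and Vt: most speed recovered

print(f"lower VDD only:   {slow / base:.2f}x the delay")    # ~1.48x
print(f"lower VDD and Vt: {fixed / base:.2f}x the delay")   # ~1.17x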

DITZEL That sounds simple enough. Why not just make it very low?

DOBBERPUHL Well, we do, but there are problems with that. The leakage current of a MOS transistor depends exponentially on Vt: as Vt goes down, the leakage current goes up exponentially. Also, Vt is a function of temperature, so you have another effect that you have to allow for.
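
The exponential dependence Dobberpuhl mentions is usually expressed through the subthreshold slope, typically somewhere around 80 to 100 millivolts per decade of current, and it worsens with temperature. The slope value in the sketch below is an assumed typical number, not one from the interview:

# Subthreshold leakage grows exponentially as Vt is reduced:
#   I_leak ~ I0 * 10**(-Vt / S), with S the subthreshold slope.
# S = 85 mV/decade is an assumed typical value; it degrades with temperature,
# which is the second effect mentioned above.
S = 85.0   # mV per decade of leakage current

def leakage_increase(delta_vt_mv):
    """Factor by which leakage grows when Vt is lowered by delta_vt_mv millivolts."""
    return 10 ** (delta_vt_mv / S)

print(f"{leakage_increase(100):.0f}x more leakage for a 100 mV lower Vt")   # ~15x
print(f"{leakage_increase(200):.0f}x more leakage for a 200 mV lower Vt")   # ~225x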

DITZEL Particularly in mobile devices where they're encapsulated, since they can get hot on the inside, which makes the power situation worse.

DOBBERPUHL That's right. Even though the power dissipation of the mobile device is necessarily low to conserve battery life, typically there's not much that you can do to cool the device. You don't want to put a fan in there; in fact, you don't even want to put in a heat sink. A lot of times the case becomes the heat sink. So the internal temperatures can be high. Basically this lowering of Vt is a double-edged sword. You get the improvement of performance but you get an increase in static leakage power.

It used to be that the only people who worried about static power were the watch guys, because the watches had to last for six months or a year or more on their batteries. And that leakage current was significant. But for most applications it didn't matter. Now we've gotten to the point where leakage current can be in the same order of magnitude as the dynamic current of the device.

Leakage
DITZEL Can you talk a little bit more about this leakage thing? What is leakage to the first order? We didn't hear about this a few years ago; power was just CV²f. But now we've got leakage coming in as a new factor. What's a simple way to picture leakage for somebody who is not a transistor engineer?

DOBBERPUHL A simple way to think about the threshold voltage is as a barrier holding back electrons, like a dam holding back water. Electrons have various statistically distributed energies, like waves on a body of water. As the barrier or dam is lowered, a higher percentage of waves will spill over the dam. The same thing applies to electrons.

And the other significant factor, which we mentioned before, is gate leakage. There the problem is that the gates are so thin that the electrons will tunnel through from the channel to the gate material and actually create a current.

DITZEL So transistors are now getting so small—maybe they're on the order of 14 angstroms in 90-nanometer technology, and you might want to go to 12 angstroms. I've heard that just that 2-angstrom difference might cause a 10x increase in gate leakage. Is there anything people can do about gate leakage in future devices?

DOBBERPUHL Well, something has to be done, because proper scaling requires both vertical and lateral reduction. And we certainly want to keep scaling according to Moore's law. Obviously, people are working on the problem. One of the ways to attack the problem is to use a higher-density dielectric for the gate material, which would allow you to have a bigger physical dimension between the channel and the gate.
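
Reading "higher-density dielectric" as a higher-permittivity (high-k) gate material, the reason it helps is that gate capacitance per unit area is permittivity divided by thickness, so a higher-k film can be physically thicker for the same capacitance, and tunneling leakage depends on the physical thickness. The permittivity value below is an assumed round number, not one from the interview:

# Gate capacitance per unit area goes as k / t, so for the same capacitance a
# higher-k dielectric can be made physically thicker, suppressing tunneling.
# The high-k permittivity below is an assumed round value.
k_sio2  = 3.9     # relative permittivity of SiO2
k_highk = 20.0    # assumed high-k material
t_sio2  = 14.0    # angstroms of SiO2 (the figure quoted above)

t_highk = t_sio2 * k_highk / k_sio2   # same capacitance, thicker film
print(f"physical thickness of the high-k film: {t_highk:.0f} angstroms")   # ~72 A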

I think there are other techniques that are under development. It's a problem that's being addressed. But I think the leakage problem is a fundamental physics problem that, at the moment, has no real complete solutions. It's going to take a lot of design engineering finesse to manage it.

Fabrication vs Logic
DITZEL I was on a panel at the Kyoto VLSI Symposium recently where there was a combination of the people who make semiconductors, the fab guys, and the people who design with them, the circuit and logic guys. The fab guys were all saying, "Hey, we solved this problem over the last 20 years. Now it's somebody else's turn, because the physics is simply limiting what we can do."

DOBBERPUHL It is. As you said, we now have vertical dimensions that are on the order of 12 or 14 angstroms, only a few atomic layers.

[Please go to Part 2 - next message]

acmqueue.com