Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Ali Chen who wrote (75626) | 3/26/2002 2:13:16 PM
From: combjelly
 
"They will lose in the long run."

I guess it depends on what you think the long run is. AMD needs a small die until it can get access to 300mm wafers, so basically through 2005. That encompasses 130nm, 90nm and possibly 65nm. The problem with the x86 instruction set that you mention is the inability to extract a whole lot of parallelism out of the code. There are several reasons for that; a big one is the restricted number of registers that can be used as general-purpose registers. x86-64 addresses this problem with 16 general-purpose registers in 64-bit mode, and it implements SSE2, both of which will let a bit more parallelism be extracted. It has also addressed the problem of going off chip, to an extent, by hacking away at latency. In addition, it has reduced the number of cycles to the L2 cache to 8, from the Athlon's 20 or so. All of this helps. But what about the future?
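Before looking ahead, here is a rough sense of what that L2 latency cut alone is worth, as a minimal average-memory-access-time sketch; the hit rates and DRAM latency below are assumptions for illustration, not figures from the post.

```python
# Rough average-memory-access-time (AMAT) comparison for the L2 latency cut
# mentioned above (roughly 20 cycles vs. 8). All rates here are assumptions.
L1_HIT_TIME = 3        # cycles spent on an L1 hit (assumed)
L1_MISS_RATE = 0.05    # fraction of accesses that miss L1 (assumed)
L2_MISS_RATE = 0.20    # fraction of L1 misses that also miss L2 (assumed)
DRAM_LATENCY = 150     # cycles to main memory (assumed)

def amat(l2_latency_cycles: float) -> float:
    """Average memory access time in cycles for a two-level cache."""
    return L1_HIT_TIME + L1_MISS_RATE * (l2_latency_cycles + L2_MISS_RATE * DRAM_LATENCY)

print(f"AMAT with a 20-cycle L2: {amat(20):.2f} cycles")
print(f"AMAT with an 8-cycle L2: {amat(8):.2f} cycles")
```

Under these assumptions the cut shaves roughly half a cycle off the average access, which adds up on memory-bound code.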

Well, at 90nm a SledgeHammer with 1MB of L2 should be down to less than 100mm^2, maybe less than 90. It would also likely have DDRII for memory, keeping its gains on latency reduction while increasing bandwidth. The move to 65nm would likely see a doubling of the L2 cache again while keeping the die size in the same range. Sure, Intel will likely have a larger L2 cache, but that is counterbalanced because they won't have the latency reduction unless they bring in an on-chip memory controller, which would complicate their SMP model enormously...
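For a sense of where the "less than 100mm^2" figure could come from, the ideal area-scaling arithmetic looks like this; the 130nm starting area is an assumed round number, not something stated above.

```python
# Ideal die-area scaling from 130nm down to 65nm. The 130nm starting area is
# an assumed round number for a SledgeHammer-class die, not a figure from the post.
area_mm2 = 190.0
for node_from, node_to in [(130, 90), (90, 65)]:
    area_mm2 *= (node_to / node_from) ** 2   # area scales with the square of the linear shrink
    print(f"{node_from}nm -> {node_to}nm: ~{area_mm2:.0f} mm^2 before any added cache")
```

The gap between the ideal-shrink number at 65nm and the "same range" guess above is roughly the room a doubled L2 would take up.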



To: Ali Chen who wrote (75626) | 3/26/2002 3:20:18 PM
From: Yousef
 
Ali,

Re: "4. To get more performance, the off-chip traffic must be
reduced, which means better caches, which means bigger
on-chip caches.

5. Bigger cache requires bigger die. Therefore, strategically,
the 300mm fabbing and big-die big-cache chips will
have increasing advantage, and will be more and more
economical as die shrinks.
"

Ali ... Finally, a post that I agree with ... Congratulations!!
I think you are finally starting to really understand the "process thingy".
Low defect density and large wafers will give a fundamental strategic advantage that the "architecture guys" can exploit.

Again ... Congratulations.

Make It So,
Yousef



To: Ali Chen who wrote (75626) | 3/26/2002 4:11:23 PM
From: hmaly
 
Ali, Re: <<<1. In the PC business, performance sells, and will be.
Either MHz-based, which is easy to communicate to
buying public, or true performance, which must have
a clear undisputed lead.>>>


While that may have been true several years ago, what percentage of people need a 2 GHz computer today? One of the fastest growing segments is the low-end computer, such that the low end now comprises 30% of the market; if you throw in notebooks as low-end (but not low-price) machines, the percentage of computers sold that give up more than 20% in performance goes up dramatically. Secondly, the Hammer, with a 20-25% improvement in IPC over the Tbird, should have roughly a 40% IPC advantage over the P4, which should be enough to get that clear, undisputed lead in IPC.
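For what it's worth, the compounding behind that 40% figure only works if the Tbird already holds an IPC edge over the P4; here is the arithmetic with an assumed 15% Tbird-over-P4 ratio, which is my illustration rather than a number from the post.

```python
# The compounding behind the "40% IPC advantage over P4" claim. The Tbird-over-P4
# ratio is an assumption chosen for illustration, not a figure from the post.
tbird_over_p4 = 1.15          # assumed: Tbird averages ~15% higher IPC than P4
hammer_over_tbird = 1.22      # the post's 20-25% claim, taken near the midpoint

hammer_over_p4 = tbird_over_p4 * hammer_over_tbird
print(f"Implied Hammer-over-P4 IPC advantage: {hammer_over_p4 - 1:.0%}")   # ~40%
```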

<<<3. Effectiveness of x86 instruction set architecture (ISA)
seems to reach its limit - no matter what the implementation
is, inner performance is about the same.>>>


I assume you are talking about IPC here, but doesn't the 20-25% IPC improvement in the Hammer disprove that statement?

<<<4. To get more performance, the off-chip traffic must be
reduced, which means better caches, which means bigger
on-chip caches.>>>


Wouldn't the CPU designers follow big iron by designing in more parallelism, and two cores, as well as bigger caches?

<<<Bigger cache requires bigger die. Therefore, strategically,
the 300mm fabbing and big-die big-cache chips will
have increasing advantage, and will be more and more
economical as die shrinks.>>>


From what I read of the big-die theory floating around here about a month ago, it had less to do with better performance and efficiency than with starving Intel's competitors of resources (big fabs cost a lot of money) and with supposed process superiority (only Intel has the ability to make big-die chips). If you have an efficient design and it still requires a big die, then you would be right. But if you have a P4, which at almost twice the size of the Tbird still has a hard time keeping up performance-wise, the Tbird should win in the marketplace, because it can be produced more cheaply and has better price/performance.

<<<Smaller-die theory will break: the die cannot be made
smaller than certain size, I guess about 80-100mm2, because
of pad/bump limitations, and current density/power dissipation limits. The smaller die will not scale down ...>>>


AMD is certainly adding cache, etc., as the Hammer will be over 100 mm2. I think AMD's small-die theory isn't so much about die size as about efficiency: getting maximum use out of the die size you need, rather than Intel's "let's build it big because we have a process lead."

<<<That's it. I don't know what AMD is thinking, ...>>>

AMD could be thinking that when computers become commoditized, the smaller, more efficient, better price/performance CPU will win a majority of customers.



To: Ali Chen who wrote (75626) | 3/26/2002 8:10:29 PM
From: pgerassi
 
Dear Ali:

Re: "1. In the PC business, performance sells, and will be.
Either MHz-based, which is easy to communicate to
buying public, or true performance, which must have
a clear undisputed lead."

Well those that look at true performance seem to agree that AMD has that lead. Hammer will just make it obvious to all.

Re: "2. As core frequencies continue to rise, the gap between memory and processor will continue to widen, and
off-chip traffic will eventually dominate on current
platforms."

This is part of the P4's problem. It uses bandwidth to attempt to reduce latency, and more and more bandwidth is wasted in this pursuit. Thus the efficiency of the available bandwidth is lower with the P4, not higher. To make up for this, the P4 requires a larger cache, among other things. So by your criteria, Intel is going down the wrong path even worse than AMD is. The Hammer actually has more off-chip communication capability than the P3, P4 or IA-64, and it will use that bandwidth even more efficiently than the Athlon does.
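To put rough numbers on that off-chip claim: the figures below are assumptions for illustration (a contemporary 400 MT/s P4 front-side bus versus a Hammer-style on-die DDR controller plus HyperTransport links), not specs quoted in the thread.

```python
# Rough off-chip bandwidth comparison. All figures are assumptions for
# illustration (circa-2002 parts), not specs quoted in the thread.
GB = 1e9

# P4: 100 MHz quad-pumped, 64-bit front-side bus shared by memory and I/O.
p4_fsb = 100e6 * 4 * 8 / GB                      # ~3.2 GB/s

# Hammer-style part: on-die dual-channel DDR333 controller plus HyperTransport links.
hammer_memory = 2 * (166e6 * 2 * 8) / GB         # ~5.3 GB/s
ht_link_per_direction = 800e6 * 2 * 2 / GB       # 16-bit link at 800 MHz DDR -> ~3.2 GB/s
hammer_links = 3 * 2 * ht_link_per_direction     # up to three bidirectional links

print(f"P4 front-side bus:        {p4_fsb:.1f} GB/s (memory and I/O share it)")
print(f"Hammer memory + HT links: {hammer_memory + hammer_links:.1f} GB/s aggregate")
```

Whatever the exact figures, the structural difference is that a front-side bus is shared between memory and I/O, while the Hammer-style layout separates them.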

Re: "3. Effectiveness of x86 instruction set architecture (ISA) seems to reach its limit - no matter what the implementation is, inner performance is about the same."

Yet the Hammer is better than the AXP, which is better than the P3, which is better than the P4, and IA-64 is not doing as well as the P4. RISC has a performance advantage at a cost of bandwidth; CISC ISAs like x86 are better at functionality versus code size. Interpreted languages like BASIC, Forth, Smalltalk and Perl have an even higher functionality-versus-code-size ratio, and the work done per fetched instruction in these interpreted high-level languages is much higher than in x86. So if code and data bandwidth is the issue, going EPIC is not the way. Remember the Z8? It could run BASIC directly (limited, yes, but it works).

Heck, a Z80 running BASIC uses only 64KB of memory total. You could put that on a single die, its performance may outrun the systems of today at that job, and it needs far less bandwidth to do the same work. Its die size would be in the tens of mm2, and it would need only a small Ethernet or memory link.

Re: "4. To get more performance, the off-chip traffic must be reduced, which means better caches, which means bigger
on-chip caches."

See above for something that would reduce off chip traffic and use smaller caches. You need to think out of the box.

Re: "5. Bigger cache requires bigger die. Therefore, strategically, the 300mm fabbing and big-die big-cache chips will have increasing advantage, and will be more and more economical as die shrinks."

Only if you are at the same process and you do not use other techniques to reduce this need. How about putting the caches on the memory dies instead? Micron showed that you could put those caches on the NB instead. AMD's Hammer uses a communications net to share the caches of all CPU dies in a system. The fact that it also boosts overall memory size and bandwidth and allows I/O to be used from anywhere within the net is a bonus.

Re: "6. Smaller-die theory will break: the die cannot be made smaller than certain size, I guess about 80-100mm2, because of pad/bump limitations, and current density/power dissipation limits. The smaller die will not scale down
as well as bigger die."

Yet both Intel and AMD have shown that chip pin-density limits seem to recede over time. The current pins per die are much higher than just a few years ago. Later, they may use current process technologies to link many future-process dies together on one substrate; IBM uses such MCMs. When copper-based communication hits a wall, there is always optical communication. You can stuff 70K lasers and photodiodes on such a die, which is much higher than your few-hundred limit, and each allows for about 10^12 bits per second per fiber without wavelength mixing.

Granted, the expense is currently rather high, and copper serves well enough for now, but in the future that may change.
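Taking the post's 70K lasers and 10^12 bits/s per fiber at face value, and assuming a few hundred electrical pins at roughly 1 Gbit/s each for comparison, the aggregate numbers work out as follows.

```python
# Aggregate scale of the optical-I/O claim above, against an assumed electrical
# pin budget of a few hundred pins at roughly 1 Gbit/s each.
optical_links = 70_000
bits_per_fiber = 1e12          # the post's 10^12 bits/s per fiber
electrical_pins = 300          # assumed
bits_per_pin = 1e9             # assumed

print(f"optical aggregate:    {optical_links * bits_per_fiber:.1e} bits/s")
print(f"electrical aggregate: {electrical_pins * bits_per_pin:.1e} bits/s")
```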

Re: "7. That's it. I don't know what AMD is thinking,
but the published tactics of "smaller die" is poised
to fail again, IMHO."

Well, let's take a different tack. If your big-die theory is so good, would it justify a die twice as large as AMD's? Three times? Now, you know that Hammers will connect to each other without any of the glue logic required by P4s. At two times the size, would two small-die CPUs beat one large-die CPU? How often would the reverse be true? On most of today's code used in servers and such, two CPUs would beat one CPU at least 90% of the time, especially in the standard multitasking environments typically found today, even on users' desks.

I submit that two 100mm2 Hammers would outrun a 200mm2 big-die P4 Xeon manufactured at about the same time on the same process generation. It is likely that one 100mm2 Hammer would outrun the P4 more than 50% of the time, which throws your big-die theory onto the same heap of failures.
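Here is a minimal wafer-economics sketch of why two small dies can beat one big die on cost; the defect density, edge-loss factor and the simple Poisson yield model are all assumptions for illustration, not figures from the thread.

```python
import math

# Wafer-economics sketch: good dies per 200mm wafer for a 100mm^2 die vs. a
# 200mm^2 die. Defect density, edge loss and the Poisson yield model are all
# assumptions for illustration.
WAFER_DIAMETER_CM = 20.0   # 200 mm wafer
EDGE_USABLE = 0.85         # assumed fraction of wafer area that yields whole dies
DEFECT_DENSITY = 0.5       # assumed defects per cm^2

def good_dies(die_area_cm2: float) -> float:
    wafer_area = math.pi * (WAFER_DIAMETER_CM / 2) ** 2
    gross = EDGE_USABLE * wafer_area / die_area_cm2
    die_yield = math.exp(-DEFECT_DENSITY * die_area_cm2)   # simple Poisson yield
    return gross * die_yield

small, large = good_dies(1.0), good_dies(2.0)   # 100 mm^2 and 200 mm^2 dies
print(f"good 100mm^2 dies per wafer: {small:.0f}")
print(f"good 200mm^2 dies per wafer: {large:.0f}")
print(f"two small dies cost ~{2 * large / small:.0%} of one large die")
```

Under those assumptions, two 100mm^2 dies come out of the fab for roughly 60% of the cost of a single 200mm^2 die, which is the economic core of the small-die argument.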

Pete



To: Ali Chen who wrote (75626) | 3/26/2002 9:14:33 PM
From: Saturn V
 
Ali,
I am sad to say that for once I agree with you 100% (almost)!

You expressed your logic and your reasoning very clearly.

Today all the old RISC vs. CISC arguments are irrelevant, and the number of registers and execution units does not matter much anymore. It is all about cache architecture and bus bandwidth. SMP is the one innovation which can achieve better utilization of chip-level resources, but it needs the software to be rewritten for most common applications. However, it is much easier to just do a "dumb increase of cache size" and avoid the messy changes in architecture. Thus the capability to increase the cache size and chip size is becoming a critical competitive factor.



To: Ali Chen who wrote (75626) | 3/26/2002 9:49:51 PM
From: AK2004
 
Ali
We have some indications of AMD's roadmap for the next 2-3 years, which seems to be on the order of the life cycle of AMD's recent designs. By that time AMD would be getting access to a 300mm fab, which would increase AMD's capacity greatly, plus another shrink is expected around then. I may be way off, but it seems like a stretch that AMD expects to quadruple its output in that time frame. If I were to speculate, I would say AMD is probably building up capacity for the K9 :-))
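The "quadruple the output" figure falls out of simple area arithmetic; the dies-per-wafer gain from one process shrink is an assumed round factor here.

```python
# Where a rough "quadruple the output" figure could come from. The shrink factor
# (one full process node roughly halving die area) is an assumption.
wafer_area_ratio = (300 / 200) ** 2   # 300mm vs. 200mm wafers -> 2.25x the silicon area
shrink_factor = 2.0                   # assumed dies-per-wafer gain from one node shrink
print(f"implied capacity multiple: ~{wafer_area_ratio * shrink_factor:.1f}x")
```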

So your arguments would be applicable only over a relatively short period of time. While not certain, I do hope that the Hammer's performance boost will be sufficient to last that long. Plus, the Sledge is supposed to have a larger L2, if I recall correctly.

Regards
-Albert



To: Ali Chen who wrote (75626) | 3/26/2002 10:34:18 PM
From: milo_morai
 
Ali/Thread, a few great tidbits!

Author: ChuckReese
Subject: Athlon FPU still has more room | Date: 3/25/02 12:46 PM

There is an interesting post at Ace's Hardware on the Athlon's floating point performance. Here is an excerpt:

For example, let us assume that SpecFP 2000 had been available at the launch of the Athlon. We would have concluded that the Athlon's FPU (PC100 SDRAM, mediocre compilers) sucks big time. Now, just a year later, the same architecture (with DDR, with a better compiler) does a whole lot better - but the FPU hasn't changed.

Without the knowledge that the Flops benchmark gives us, we would not have understood that the Athlon FPU is very powerful but needs to be unlocked by a fast memory subsystem. 3DSMax benchmarks confirm this, and so does SpecFP.

aceshardware.com

Even with DDR memory, the Athlon floating-point unit is being held back by the memory subsystem. In this light, AMD's decision to change the memory hierarchy for Hammer while keeping the functional units the same makes a lot of sense.

CR


Also

Author: BlackMagicGuy
Subject: Re: thought on a0 | Date: 3/25/02 9:03 PM

Sharps,

A few details. First, I had said that CH was never run below Itanium speeds. That should narrow your frequency range quite a bit.

Next, there are lots of reasons why first silicon is slow. Process & design are the 2 biggest. Back with the first K5's, we had to turn off everything inside the CPU to get it to do anything. It was a really 'dumb' 8086. With K6, I was given the task of designing an external PLL circuit in case the internal one didn't work. Guess what? I put together a whole lot of cards quickly to get initial silicon running, & it was only at 60% of the launch frequency. The next rev was done quickly & my cards weren't needed any more. We brought up the first Athlons with Digital's chipset. Uggh, those chipsets were $thousands each, and that wasn't something to judge final MB prices on, nor was performance.

Things happen. Slower speeds keep the part at least functional despite lots of other tedious problems, until you get to them and resolve them. But you have to attack the problems one by one; debugging one problem when another causes instability is one way to go insane. Judging final silicon by the very first samples isn't meaningful at all.

$0.02

Black Magic


From TMF.

Also found this.

Bill Gates and Hammer
infosatellite.com

Author: JoshMST
Subject: Re: Clawhammer HYPE! | Date: 3/24/02 12:35 PM

Some very interesting points and comments here, and some very good questions. BTW, I am the guy that runs PenStar Systems, LLC, and wrote the Today series of articles.

I am actually expecting review systems sporting Clawhammers to be sent around to the press in an August timeframe. At that point there will, of course, only be AMD 8000-based motherboards specially made for AMD (reference designs). I could be wrong, but in terms of timing, given where AMD is now with their silicon, I don't think I am terribly off base.

I wholly expect to start seeing shipments of Hammer in September rather than October, with plenty of choices for motherboards (Hammer has a huge advantage in using 4-layer motherboards at introduction, instead of the 6-layer motherboards the original Athlon required). HyperTransport really is AMD's crown jewel when it comes to this, and board prices at introduction will be far lower than for AMD 750 and AMD 760 based boards. I have a feeling that VIA will not be the first third-party chipset maker to release a Hammer chipset; I think that distinction will go to either SiS or NVIDIA. SiS is being very aggressive in this market, and I think they are working very closely with AMD. NVIDIA is probably the one company in the world with the most experience with HyperTransport, as they have sold millions of products supporting that technology. Their nForce for Hammer could actually be released about the same time as the processors, and it will most likely come with and without video (supplied by the NV-17 core, the same as the new GeForce4 MX and Go). If I were a betting man, this is where I would put my money (think of it: they already have a very advanced southbridge that supports HyperTransport!). They only need to do a new northbridge (no memory controller, no EV-6 bus protocol, etc.).

From all indications, clock speed will not be a problem with the Hammer; I would seriously expect the low end to start at 2.0 GHz, which would give it a 2600+ rating or so. The high end at release could be up to 2.2 GHz, which is around the 3000+ model mark. I am probably being pretty conservative here, though I have a sneaking suspicion that these numbers will be higher at release.
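For what those guesses would imply about the model-number scale (assuming the rating is meant to be read against a competitor's clock speed, as with the Athlon XP), the arithmetic is simple:

```python
# The per-clock premium implied by those guessed model numbers, assuming the
# rating is meant to be read against a competitor's clock speed (Athlon XP style).
for clock_ghz, rating in [(2.0, 2600), (2.2, 3000)]:
    premium = rating / (clock_ghz * 1000)
    print(f"{clock_ghz} GHz at a {rating}+ rating implies ~{premium:.2f}x per-clock advantage")
```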

I also agree that there will probably be another two revisions before anything is considered production-ready silicon. While specialty dies take only 2 weeks to be fabbed, there is plenty of work going on troubleshooting the finished part and making design changes (which takes weeks and weeks). Production silicon usually takes 6 weeks to fab and package, but the specialty stuff is rushed through (at a higher cost, of course). I also wouldn't doubt that engineers are watching A2 silicon run at this very minute.

If you couldn't tell, I am pretty excited about the Hammer series - not so much for the actual processor itself, but rather for how it addresses so many system-level problems. HyperTransport and the integrated memory controller, while not revolutionary, are very important to the overall performance of the system. Now that AMD can keep those functional units derived from the Athlon busy most of the time, the true potential of this processor core can be unlocked. Nifty stuff.


Author: JoshMST
Subject: Intel and Hypertransport | Date: 3/26/02 11:09 AM

At Comdex this year I had the chance to sit down with Chris Neuts of AMD and Luis Lorenzana of BSMG, both of whom are involved with HyperTransport. I asked them how they thought the consortium would react to Intel joining HT. They both laughed and said HT would benefit greatly from a company like Intel joining, and all of the members would be very, very excited about it. I then asked if they thought Intel would ever join, and they laughed again. Intel isn't really in the business of "joining" anything that it didn't create.

The terms of joining HT are basically what is holding Intel back. First off is the $10,000 to join, which is peanuts for these companies, so money isn't the problem. The problem is with IP. If Intel were to join HT and develop an inter-CPU interconnect using HT, much of the technology developed for such a device would have to be shared with the other members. This would mean that if AMD's HT connections between Hammer processors were less efficient than Intel's design, then AMD could use the HT interconnect technology developed by Intel. Intel is not out to help AMD in any way, shape, or form whatsoever. Intel will never join HT, which is why Intel started the 3GIO consortium; that allows Intel to control more of the technology and specifications as a founding member. Since 3GIO is a shared bus protocol, I would bet Intel is working on an inter-CPU 3GIO communications system. This obviously has a ways to go, since there is no finalized 3GIO spec at this moment. Still, while Intel would probably benefit from joining HT, it isn't going to do so because of HT's liberal licensing policies. Take, for example, what would happen if Intel did use HT to communicate with the rest of the system (like the Hammer series does): Intel would no longer control the "bus protocol," and any chipset company that is a member of HT could legally make chipsets supporting Intel processors. This would mean that VIA could make a legal PentiumX chipset if they joined, and NVIDIA would finally be able to do the same. Intel wants to keep its licensing power when it comes to chipsets.

There would be no legal way of blocking Intel from using HT to connect CPUs (like the Hammer does), but the IP and licensing agreements that Intel would have to sign to use it are not beneficial to Intel's way of doing business (e.g., keeping the cookies away from the players that could cause the most competition).



To: Ali Chen who wrote (75626) | 3/26/2002 11:40:33 PM
From: heatsinker2
 
Ali - "I think AMD is making a big strategic mistake by taking a low-profile small-size path, with all this short-sighted 'revenue per wafer'. They will lose in the long run."

Two issues here. First, what is losing? I think you are implying that if AMD can't thump Intel on performance, then AMD (and we shareholders) are in trouble. But I would like to point out that we, or at least most of us, are not primarily interested in AMD achieving the fastest CPU. What we want is the highest possible stock price. So we can win even if AMD doesn't conquer Intel on the technical front. If AMD can just stay close over the next year, I am very hopeful that Wall Street will start valuing AMD with the same P/E as its peers in the semi industry. That would mean a big run-up in the stock price.

Secondly, AMD's current strategy is just that: its current strategy, which means they can change it whenever they want. Right now, they have a die-size advantage and they don't really have the cash to build 300mm fabs, so they are pursuing the strategy that makes sense today. There's nothing stopping AMD from moving to large-cache solutions in the future.

BTW, nice post. This is the 8th response. That's a lot for something that is actually on topic.