Technology Stocks : Intel Corporation (INTC)


To: Joey Smith who wrote (108250) 8/24/2000 5:28:36 PM
From: pgerassi
 
Dear Joey:

A launch director is not a designer. A mis-prefetch is an oxymoron. Intel ran a supposedly air-cooled, hand-picked part for a few seconds (1 to 5) with a simple clock utility. You can run a Tbird with no heatsink for a few seconds at 1000 MHz. But no one but a fool would try to boot a computer up that way. Did they open the case, show it encased in plexiglass, or better yet uncased, and show you what cooling method they actually used? The "ordinary" air-cooled part looked like an 80mm fan on top of a one-pound heatsink. You can throw in a Peltier element or two and still call it air cooled. Heck, a Kryotech case can still be called air cooled (the fact that the condenser rather than the CPU is air cooled does not matter). And enough of them exist to not call it special anymore either. The whole demo was a staged act anyway.

Ace did not have a P4 to bench either. His trace cache micro-op size is incorrect because he forgot that these are RISC ops, and RISC instructions are typically at least as big as the word size, plus they need an address to make sure they refer to the right decoded ops. Thus, a minimum of 8 bytes per op, and more likely 10 or more, yields a trace cache somewhere between 96 and 192KB in size. Even Intel admits that the IPC of the P4 is less than that of the P6 (and the P6's is less than that of the Athlon). Since there is no cache in front of the decoders, an additional L2 latency must be added to the decoding pipeline on a trace cache miss, which is far more likely when there is a branch misprediction, so a miss could easily exceed 30 cycles. Once you go through Johan's calculations, they come out and say that the IPC is definitely lower than that of the P3, directly opposing his own conclusions. There are quite a few negative surprises: "So this seems to be a small mystery.", "P4 will suffer up to 50% more from branch misprediction.", "shift and multiplies are not" (handled quickly), "P4 has a better FPU than the P6!! Why? The P4 offers much more bandwidth via the L2-cache and the 400 MHz FSB than the P6 does" (bandwidth in the fetch does not belong to the FPU), and "Intel reported that the Sysmark Windows Media encoder test reported 50% higher numbers on the P4 1.4 GHz chip (i850 chipset) than the 1 GHz P3 (i820 chipset). Both systems contained RDRAM" (why did they not want to compare it to the 815E or 840?). All of these inconsistencies show that Johan does not know about internal CPU architectures or how performance is measured. BTW, Tbird has 65% better throughput on PC1600 than on PC133 (or 120% better than on PC100) in independent benchmarks (see PCwatch) on the same CPU, so the 150% result above is less than what a simple chipset change delivers, even with 140% of the clock.
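For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes the trace cache holds 12K micro-ops (Intel's published figure, not stated in this post), and the 16-byte upper width is only inferred from the 192KB end of the range. The function names are made up for illustration.

```python
# Assumption: the P4 trace cache holds 12K micro-ops (Intel's published
# capacity; the post itself only gives the resulting byte totals).
TRACE_CACHE_UOPS = 12 * 1024


def trace_cache_kb(bytes_per_uop):
    """Storage in KB needed to hold every micro-op at the given width."""
    return TRACE_CACHE_UOPS * bytes_per_uop / 1024


def per_clock_ratio(speedup, clock_ratio):
    """Benchmark speedup left over once the clock advantage is divided out."""
    return speedup / clock_ratio


# 8 bytes is the post's minimum, 10 its "more like" figure; 16 is an
# assumed width that reproduces the 192KB upper bound.
for width in (8, 10, 16):
    print(f"{width} bytes/op -> {trace_cache_kb(width):.0f} KB")

# Media encoder result: 150% of the P3 score at 140% of the clock.
p4_gain = per_clock_ratio(1.50, 1.40)
print(f"P4 vs P3 per clock: {p4_gain:.3f}x")

# Tbird on PC1600 vs PC133, same CPU and clock: 65% better throughput.
tbird_memory_gain = 1.65
print(f"Tbird memory change alone: {tbird_memory_gain:.2f}x")
```

Under those assumptions the overall 1.5x result is below the 1.65x from the memory change by itself, and per clock the P4 gain is only about 7%.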

Thus, Johan is definitely wrong, that was no CPU designer (Intel or otherwise), and Intel did not run at a sustained 2 GHz.

Pete



To: Joey Smith who wrote (108250) 8/24/2000 9:06:18 PM
From: Dan3
 
Re: fact that an Intel designer at IDF specifically said No Performance Degradation clock-for-clock gives me confidence

The quote I've seen was that branch misprediction due to pipeline length wasn't causing performance degradation, not that there wasn't any performance degradation. It was an ambiguous quote, though it did imply little degradation. It also leaves room for plausible deniability if there turns out to be lots of degradation.

We all have to wait and see.

Dan