To: fyodor_ who wrote (74589)
From: Dan3
3/15/2002 7:38:32 AM

Re: So quantihertz formula kept constant and only a 38% reduction in core size… Is JC wrong and AMD did double the L2 cache? If not, I certainly hope that there are a few architectural improvements and an FSB bump.

P4 probably doesn't scale nearly as well as Athlon, because of P4's relatively limited cache design. Athlon can cache 20 locations per page, while P4 is limited to 8. Athlon can maintain 64K of instructions and 64K of data in L1, while P4 is limited to 8K of data and 12K of instructions. Hit rates will vary with code and data, but Athlon will have a substantially better hit rate than P4.

Doubling the "wayness" of a cache is roughly equivalent to a 2 to 3 times size increase, so Athlon's 384K cache performs about as well as P4 with a 750K to 1 Meg cache, which is probably the main reason Athlon does so well in server applications.

The penalty for going off-cache gets more severe as clock speeds increase and the nanoseconds needed for a memory fetch represent more and more processor clocks. Moving to DDR will help P4, but, especially considering that P4 already needs more clocks for a given level of performance, it won't make up for the "performance falls off a cliff" effect when going to main memory at 3GHz.

If a processor stalls on 1 in 100 instructions at 1GHz, and it takes 60ns to load from main memory, then its "duty cycle" is to work 99 clocks / wait 60 clocks / work 99 clocks / wait 60 clocks / etc. At 3GHz for P4 that becomes work 99 clocks / wait for 180 clocks... While for Athlon (at 2.33GHz for 3000+) it's work for 130 clocks (higher hit rate) and wait for 140 clocks (since Athlon achieves its performance at a lower clock).
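
For anyone who wants to check the duty cycle arithmetic, here is a rough Python sketch. It uses only the numbers in the post (60ns memory latency, 99 or 130 work clocks between stalls); the function names are made up for illustration, and the 130-clock figure for Athlon is taken as given rather than derived. The "busy" figure is just the share of clocks spent working, not a throughput comparison between the two chips.

```python
def stall_clocks(clock_ghz, memory_latency_ns=60.0):
    """Clocks spent waiting on one main-memory fetch (GHz * ns = clocks)."""
    return clock_ghz * memory_latency_ns


def busy_fraction(work_clocks, clock_ghz, memory_latency_ns=60.0):
    """Share of clocks spent working in one work/wait cycle."""
    wait = stall_clocks(clock_ghz, memory_latency_ns)
    return work_clocks / (work_clocks + wait)


# (label, work clocks between stalls, clock speed in GHz), all from the post
cases = [
    ("1GHz baseline", 99, 1.0),
    ("P4 at 3GHz", 99, 3.0),
    ("Athlon at 2.33GHz", 130, 2.33),
]

for label, work, ghz in cases:
    wait = stall_clocks(ghz)
    busy = busy_fraction(work, ghz)
    print(f"{label}: work {work} / wait {wait:.0f} clocks, busy {busy:.0%}")
```

Changing `memory_latency_ns` shows how much a faster memory subsystem would have to improve to offset a higher clock.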