To: Tenchusatsu who wrote (107904)
8/21/2000 5:00:06 PM
From: pgerassi

Dear Tench:

Re: P4 architecture problems

Taking your points one by one:

1) See the Intel presentation at: developer.intel.com . They talk of some instructions that will take only 1/2 cycle to complete; therefore, there must be instructions that take longer (the optimization guide says to avoid shifts because they take far longer in some cases). Besides, they have two whole stages (four half clocks) for the ALU, so there.

2) They apparently do not have more than two issues in any of the stages prior to the ALUs, and skipping a half stage will cause problems for the back-end pipes (those will also need to be wider than two issues, or things will pile up there, stalling the pipe until reorder buffers free up). They apparently are counting on stalls to clean out the reorder buffers, which for sure lowers the IPC below theoretical.

3) When no switches are used, Athlon blows by P3. SPEC2000 is a defined dataset, and thus is heavily optimized for. Without optimization for either, Athlon blows by P3. Only the highly SPEC-optimized compiler for Intel allows the P3 to execute faster than the less optimized Athlon (which had no special compiler until recently, and that one may not be as highly optimized yet). When hand coded for ultimate speed, Athlon beats P3 (hand coding is not allowed by SPEC).

4) Go to Moldyn, QMC, etc., in Tim Wilkin's benchmarks on JC's website: jc-news.com . See the other linked pages, particularly Wilkin's explanation and news page. This shows an Athlon 800 MHz outrunning a dual P3 700 MHz machine. Check out Moldyn and Primordia. At no time does the top P3 outrun Athlon, either K75 or Tbird Slot A (there are no Socket A scores). From my observations of a P3 700 BX and a Tbird 700 Socket A, both running the same software on Linux boxes, the Tbird handled more load and gave quicker response than the P3 when running Oracle v8.0 database applications and testing loads.

I think that should cover it.

Pete