Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: pgerassi who wrote (5183), 8/15/2000 9:21:56 PM
From: jcholewa
 
Re: IPC gains P2 vs Pentium

> Out of order speculative execution is what makes P2 faster than Pentium in addition to superscalar RISC based CPU IN
> SPITE of longer pipeline. Had the P2 just been a longer pipeline, it would not have had more IPC. Each stage has to
> be able to execute more than one instruction at a time for there to be an IPC greater than 1. Pipelining only allows for
> the maximal use of a resource if and only if that resource is the bottleneck of the pipe. Any part that is not the bottleneck
> will not be fully utilized. Thus a single pipeline can only execute one instruction per advance of the pipe. That is what
> IPC, instructions per clock (advance), means. Thus to get a higher than one instruction per clock requires more than one
> pipe. Athlon has 3 decoder pipes plus 9 execution pipes. P2 had three execute pipes and two decode pipes.
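
As a rough illustration of the quoted argument, the sketch below treats peak IPC as capped by the narrowest stage of the machine. The function name `peak_ipc` is made up for this example, the pipe counts are the ones given in the quoted post (not verified here), and real chips fall short of this ceiling because of stalls, dependencies, and cache misses.

```python
def peak_ipc(decode_pipes: int, execute_pipes: int) -> int:
    """Toy upper bound on instructions per clock.

    Assumes the narrowest stage limits how many instructions
    can advance on each clock. This is a ceiling, not a prediction.
    """
    return min(decode_pipes, execute_pipes)


# Pipe counts as stated in the quoted post (assumed, not verified here).
single_pipe = peak_ipc(decode_pipes=1, execute_pipes=1)
p2 = peak_ipc(decode_pipes=2, execute_pipes=3)
athlon = peak_ipc(decode_pipes=3, execute_pipes=9)

print(single_pipe, p2, athlon)  # 1 2 3
```

On this simple model, a longer pipeline changes nothing by itself; only a wider one raises the ceiling above 1.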

I know this is probably somewhat trivial, but shouldn't the PII's faster level-two cache also add to its performance at a given clock? The PII-266's L2 had 33% higher throughput than the P55C-266's L2, and having the L2 on the module instead of on the motherboard might have given it an even greater latency advantage. I'm also told that the GTL+ bus supports features which make it inherently faster than Socket 7 boards. (I'm slightly talking out my ass here, but a friendly Cyrix fellow once suggested a term like "pipelined transactions", and I keep remembering that.)

> Now P4 does not have many more pipes than P3. In fact it
> has less FPU pipes.

P6 had (per azillionmonkeys.com, if I'm understanding it correctly, and there's no guarantee of that!) one main pipe for FADD, FMUL, and the like, and a lesser pipe for FXCH. P4 has (per watch.impress.co.jp; I had a better image of this, apologies that I am unable to produce it) one pipe for FADD/FMUL/etc. and one pipe for FSTORE and similar "maintenance" tasks. Isn't this correct overall? Wouldn't it then be more accurate to say that the P4 and P6 have the same number of FPU pipes (two)?

Of course, if the P6 can run an FSTORE or FP move from elsewhere on the CPU at the same time it runs that FADD or FMUL, then what I'm writing here is totally moot. Doh. :)

Pete: You seem to be an intelligent fellow. I am amazed at how many intelligent fellows are saying stuff along the lines of PIII-1000 ~= P4-1400. I choose to disbelieve this for the sake of my own sanity (and because I have a little more faith in Intel's engineers, if not their x86 management subgroup), but you should probably know that your calculations here are held by others as well.
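
For what it's worth, here is what PIII-1000 ~= P4-1400 would imply, assuming the usual simplification that performance is IPC times clock speed. The function name `relative_ipc` is just a label for this example, and the model ignores everything about workload mix.

```python
def relative_ipc(clock_a_mhz: float, clock_b_mhz: float) -> float:
    """IPC of chip B relative to chip A, if both deliver equal performance.

    Assumes performance = IPC * clock, so equal performance means
    ipc_b / ipc_a = clock_a / clock_b.
    """
    return clock_a_mhz / clock_b_mhz


# PIII at 1000 MHz matching a P4 at 1400 MHz (the claim under discussion).
ratio = relative_ipc(1000, 1400)
print(f"{ratio:.3f}")  # 0.714
```

In other words, the claim amounts to saying the P4 gets roughly 29% less done per clock than the PIII.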

-JC