Silicon Investor (SI) -- The First Internet Community



Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: Saturn V who wrote (5158)	8/15/2000 5:14:07 PM
From: Scumbria

Saturn,

"So by Scumbria's reasoning the Pentium II (III) should have a lower IPC than the Pentium."

The PII had better IPC than the Pentium because it is a true superscalar (3-way) processor, and it has a much larger cache than the original 8K-I/8K-D on the Pentium. The Pentium MMX had comparable IPC to the PII because they doubled the cache size.

Willy will have a lower IPC than the PIII. Caches have already reached the point of diminishing returns, and so have branch prediction algorithms.

Scumbria

To: Saturn V who wrote (5158)	8/15/2000 5:16:04 PM
From: Scumbria

Saturn,

The double-clocked ALU will have minimal impact on IPC. The main impact of the ALU will be that it chokes off the maximum clock speed of the part.

Scumbria

To: Saturn V who wrote (5158)	8/15/2000 6:19:25 PM
From: pgerassi

Dear Saturn:

Re: 286

I suggest you look at the documentation again. The 286 is not pipelined. There is no branch misprediction penalty. There is a penalty for taking a branch, as the 286 needed to generate an address depending on the type of address used. That was documented as 5 cycles if the branch is not taken, 8 cycles for immediate, 10 cycles for near offset, 12 cycles for far offset, 10 cycles for register, etc. Notice that nothing changes whether the branch is taken once or five hundred times in a row.

Now on a P3, if it does not correctly predict a branch, all speculative instructions executed after the wrong branch was assumed must be flushed. The work lost on these instructions is what is called the branch misprediction penalty. This is typically the length of the pipe between instruction fetch and conditional test. It is this section of the pipe that needs to be flushed so that the correct instructions can begin to be processed. Note that a vector jump (i.e., jmp *0x0(,%eax,4)) is almost impossible to predict because there are millions of possible jump locations (I believe the P3 and all current x86 CPUs simply assume a jump of zero, i.e., no jump), so a branch mispredict penalty is almost certain.

Now I do accept that a pipelined processor may not have a branch mispredict penalty if it does not speculate after branch instructions. In the original RISC systems like SPARC, the instruction after a branch is always executed whether the branch is taken or not. This allows for no branch misprediction penalty in a two-stage pipeline of fetch and execute. But this is an advantage for very short pipelines only.

Pipelining only makes sense once the number of instructions executed per clock rises near 1. All RISC-based CPUs executed at least one instruction per clock (register loads and stores to memory required an extra cycle due to FSB limitations). They were superscalar when they could execute more than one instruction per cycle.
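To make the contrast above concrete, here is a rough sketch in Python. The 286 cycle counts are the ones quoted in the post; the flush depth of 10 cycles and the "predict the last outcome" predictor are assumptions chosen only for illustration, not a description of any real P3 mechanism.

```python
# 286 figures are the ones quoted in the post (not-taken vs. taken, immediate).
NOT_TAKEN_286 = 5
TAKEN_IMMEDIATE_286 = 8


def cycles_286(outcomes):
    """Fixed cost per branch: branch history makes no difference."""
    return sum(TAKEN_IMMEDIATE_286 if taken else NOT_TAKEN_286
               for taken in outcomes)


def cycles_pipelined(outcomes, flush_depth=10):
    """One cycle per branch, plus a flush whenever the guess is wrong.

    flush_depth is a hypothetical number of stages between fetch and
    the conditional test; the predictor simply repeats the last outcome.
    """
    prediction = False  # assumed initial guess: not taken
    total = 0
    for taken in outcomes:
        total += 1
        if taken != prediction:
            total += flush_depth  # speculative work is thrown away
        prediction = taken
    return total


steady = [True] * 500              # same branch taken 500 times in a row
alternating = [True, False] * 250  # worst case for this toy predictor

print(cycles_286(steady), cycles_pipelined(steady))
print(cycles_286(alternating), cycles_pipelined(alternating))
```

In this toy model the 286 cost depends only on how many branches are taken, while the pipelined cost swings widely with how predictable the pattern is.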
Since there were very few CPUs that took more than two cycles to execute consecutive instructions (the FPU on the K6 was not considered to be pipelined, even though it took dozens of cycles to execute an FPU instruction like ATAN but could accept a new FPU instruction every other cycle), any CPU that regularly took more than two cycles to execute a simple instruction like an "or" is considered not pipelined. A superscalar CPU might not be pipelined, but I do not know of any.

Thus the 286, which took 2 or 3 cycles to do an "or ah,al", could not be considered pipelined. It averaged about 4 to 8 cycles per instruction, whereas the 8086 took about 10 cycles per instruction. The 386 took about 3 cycles and the 486 took 1 cycle per instruction. Thus, the 486 is the first x86 CPU that could have been pipelined. The Pentium must have been pipelined because it performed branch prediction.

The main reason why the P2 had a higher IPC than the Pentium was that it used out-of-order execution in conjunction with a larger superscalar, superpipelined RISC core. The ideal pipelined processor with perfect branch prediction will only execute one instruction per cycle, an IPC of 1. It takes superscalar processing (multiple pipelines, coprocessors, and/or execution units) to get an IPC above 1.

Pete
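The arithmetic behind those figures can be sketched as follows. The cycles-per-instruction values are the rough averages quoted in the post; the function names, the branch fraction, the misprediction rate and the flush penalty are all hypothetical, and the model ignores cache misses and every other stall.

```python
# Rough average cycles per instruction, as quoted in the post.
CPI_QUOTED = {
    "8086": 10.0,
    "286 (low)": 4.0,
    "286 (high)": 8.0,
    "386": 3.0,
    "486": 1.0,
}


def ipc_from_cpi(cpi):
    """IPC is simply the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def effective_ipc(issue_width, branch_fraction, mispredict_rate, flush_penalty):
    """Upper bound on IPC for a machine issuing issue_width per cycle.

    All four parameters are illustrative assumptions. With perfect
    prediction (mispredict_rate == 0) the bound is issue_width itself.
    """
    cycles_per_instruction = (1.0 / issue_width
                              + branch_fraction * mispredict_rate * flush_penalty)
    return 1.0 / cycles_per_instruction


for name, cpi in CPI_QUOTED.items():
    print(f"{name}: IPC about {ipc_from_cpi(cpi):.2f}")

# Single pipeline, perfect prediction: capped at 1 instruction per cycle.
print(effective_ipc(1, 0.2, 0.0, 10))
# Same pipeline with 10% of branches mispredicted: drops below 1.
print(effective_ipc(1, 0.2, 0.1, 10))
# Three-wide issue with perfect prediction: the cap rises to 3.
print(effective_ipc(3, 0.2, 0.0, 10))
```

Under these assumptions a single pipeline can never exceed an IPC of 1 however good the predictor is; only a wider issue width raises the ceiling.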