Technology Stocks : Intel Corporation (INTC)

To: Charles Gryba who wrote (151149)
Date: 12/5/2001 11:31:47 AM
From: wanna_bmw
 
Constantine, Re: "why does wbmw keep saying that the P3 has 3 decoders and the P4 one but you say they have the same decoder? Which one is right?"

Joe is incorrect. He is working from the assumption that both processors have the same decoder, but they do not. It's amazing to see such stupidity on this thread right now. Some of the AMDroids so badly want to reach their preconceived conclusion that they are now bending the facts to fit their own twisted beliefs.

The *FACT* of the matter is that the Pentium 4 has 1 complex decoder running at up to 2GHz. The Pentium III has 1 complex decoder and 2 simple decoders running at up to 1GHz. What's the difference between a "simple" and a "complex" decoder? A simple decoder can decode any instruction that translates into a single uop (e.g. a register-to-register ADD). A complex decoder can decode any instruction. Why have two kinds? A simple decoder takes far less logic, and therefore less die area.

Therefore, consider the application. The case statements on each iteration contain an ADD instruction. Kap says the case structure is accessed through some kind of hash table. Therefore, there are some memory movement instructions, some comparison instructions, and some branch instructions. The latter two are probably combined into a compare and branch instruction. Such an instruction requires a complex decoder. The memory instructions also require a complex decoder. The ADD instructions can use the simple decoders.

So on every iteration, you will find that either one or two decoders of the Pentium III can be used per clock: two if there is an ADD instruction waiting in the prefetch cache, and one if there is not. Since Kap's loop is built predominantly around the innermost summation statement, we can assume there are a lot of ADD instructions. Therefore, we expect the Pentium III to keep two decoders busy per clock (on average), while the Pentium 4 keeps only one busy.

That accounts for the discrepancy between the two cores if you ensure that there is always a trace cache miss on the Pentium 4.
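
To make the argument above concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions: the function name decode_cycles, the "simple"/"complex" labels, and the four-instruction loop body are all made up for illustration and are not taken from Kap's actual code. It assumes in-order decode, one instruction per decoder per clock, and a Pentium 4 that misses the trace cache on every instruction, and it ignores fetch alignment, uop counts, and everything downstream of the decoders.

```python
def decode_cycles(stream, simple_decoders):
    """Count clocks needed to decode `stream` in program order with one
    complex decoder plus `simple_decoders` simple decoders (toy model)."""
    cycles = 0
    i = 0
    while i < len(stream):
        # The complex decoder accepts the next instruction, whatever it is.
        i += 1
        # Each simple decoder takes the following instruction only if it is
        # a single-uop ("simple") instruction; otherwise the group ends.
        used = 0
        while used < simple_decoders and i < len(stream) and stream[i] == "simple":
            i += 1
            used += 1
        cycles += 1
    return cycles


# Hypothetical loop body (an assumption, not Kap's real code):
# a memory move, two ADDs, then a compare-and-branch.
loop_body = ["complex", "simple", "simple", "complex"]
stream = loop_body * 1000

p3_like = decode_cycles(stream, simple_decoders=2)  # 1 complex + 2 simple
p4_like = decode_cycles(stream, simple_decoders=0)  # 1 complex only

print("P3-like clocks:", p3_like, "instructions/clock:", len(stream) / p3_like)
print("P4-like clocks:", p4_like, "instructions/clock:", len(stream) / p4_like)
```

With this particular made-up mix, the P3-like setup averages two instructions per clock and the P4-like setup one. A different instruction mix will give a different ratio, so treat the numbers as an illustration of the reasoning, not a measurement.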

wbmw