Technology Stocks : Intel Corporation (INTC)


To: Joe NYC who wrote (151254), 12/5/2001 7:38:59 PM
From: wanna_bmw
 
Joe, Re: "There may be pipeline flushes, but I think you are looking for them in the wrong place. It is not the lookup that is time consuming."

Actually, Tenchusatsu has a point. Recall that the execution stage on the Pentium 4 is the 18th stage in the pipeline. Therefore, once the pipeline has been flushed, those ADD instructions face an 18-cycle latency before they reach the execution stage. With a randomized index into the lookup table, the branch predictor is bound to be wrong on occasion. That's because the branch target is not determined until the randomized function has been processed, which happens once per iteration, so the branch cannot be predicted ahead of time. You have to take the latency of the longer pipeline into account.
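
To put a rough number on that, here is a minimal C sketch, not a measurement. It assumes a deliberately simple "same target as last time" predictor (the real Pentium 4 predictor is more elaborate), uses a small LCG as a stand-in for the randomized function, and takes the 18-cycle refill figure from above. The names lcg_next, miss_rate and avg_flush_cycles are made up for illustration.

```c
#include <stdint.h>

/* Assumption taken from the post: about 18 cycles to refill the pipeline
   up to the execution stage after a flush. */
#define REFILL_CYCLES 18.0

/* Illustrative LCG standing in for the loop's randomized function.
   Returns the top 8 bits of the state, i.e. a value in 0..255. */
uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 24;
}

/* Fraction of iterations in which a hypothetical "same target as last
   time" predictor picks the wrong case. n_cases must be in 1..256. */
double miss_rate(unsigned n_cases, unsigned iterations, uint32_t seed)
{
    uint32_t state = seed;
    unsigned predicted = 0;
    unsigned misses = 0;

    for (unsigned i = 0; i < iterations; i++) {
        unsigned actual = lcg_next(&state) % n_cases;
        if (actual != predicted)
            misses++;
        predicted = actual;   /* predict a repeat of the last target */
    }
    return iterations ? (double)misses / iterations : 0.0;
}

/* Average flush cost per iteration, in cycles, for a given miss rate. */
double avg_flush_cycles(double rate)
{
    return rate * REFILL_CYCLES;
}
```

With four equally likely cases this toy predictor should miss about three times out of four, so avg_flush_cycles(miss_rate(4, 100000, 1)) lands somewhere near 13 cycles per iteration under these assumptions.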

wbmw



To: Joe NYC who wrote (151254), 12/5/2001 7:54:22 PM
From: Tenchusatsu
 
Joe, <What takes time is building up of the lookup table for the case statement (if it is in fact a table). That has to be processed N times per loop iteration.>

The assembly code for the case statement is hard-coded. It doesn't matter whether it comes in the form of a jump table or a branch tree. It does not need to be "rebuilt" per iteration.
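
A minimal C sketch of what that looks like, assuming a hypothetical four-way case statement and a small LCG in place of the randomized function (apply_case, next_random and run_loop are made-up names, not the code under discussion). The dispatch for the switch is emitted once by the compiler; the only per-iteration work is computing the selector and taking the branch.

```c
#include <stdint.h>

/* Hypothetical 4-way case statement. Whether the compiler lowers this to
   a jump table or a compare/branch tree, that code is generated once at
   compile time and is not rebuilt while the program runs. */
int apply_case(unsigned selector, int acc)
{
    switch (selector & 3u) {
    case 0:  return acc + 1;
    case 1:  return acc + 2;
    case 2:  return acc + 3;
    default: return acc + 4;
    }
}

/* Illustrative LCG standing in for the randomized function. */
uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/* One selector per iteration, one dispatch per iteration. */
int run_loop(uint32_t seed, int iterations)
{
    uint32_t state = seed;
    int acc = 0;

    for (int i = 0; i < iterations; i++)
        acc = apply_case(next_random(&state), acc);
    return acc;
}
```

Compiling with something like gcc -O2 -S and reading the output for apply_case shows the fixed dispatch sequence; nothing in run_loop constructs it.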

Tenchusatsu