Technology Stocks : Intel Corporation (INTC)

To: kapkan4u who wrote (150424)
From: wanna_bmw
Posted: 11/29/2001 6:23:59 PM
 
Kap, Re: "large dense switch statement would always be generated as a computed branch into a table. So the inner loop has a computed branch with randomly generated target. This target can not be predicted by a branch predictor. Additionally, after the branch predictor fails, there will be a trace cache miss because the chance of a hit on this code is worse than 1 in 10. Just compare the number of statements with TC's capacity."

It would probably depend on the compiler, but I admit that you probably know more about compilers than I do.

Clever technique - you can't seem to win your argument about Pentium 4 performance, so you switch the conversation to compilers, and then bait me to argue the topic on your own grounds.

Not that it isn't fun wrestling with pigs (tm Jerry Sanders), but assuming you're right and large dense switch statements end up compiling into tables, it sounds to me like the table itself would be data, and so would sit in the L1 data cache, not the trace cache. The trace cache holds decoded micro-ops, and in the case of a simple loop and table, I am quite confident that the application wouldn't exceed the trace cache's 12K uops.
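
For anyone following along, here is a rough C sketch of the kind of code being argued about. The opcodes, function names, and handler table are all hypothetical, made up for illustration, and whether a given compiler really turns the switch into a jump table depends on the compiler and its options. The first loop is a dense switch; the second spells out by hand what a table-based lowering roughly amounts to: an array of addresses that is read as data, followed by an indirect branch to handler code that still has to be fetched and decoded like any other instructions.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy interpreter (illustration only). */
enum { OP_INC, OP_DEC, OP_DBL, OP_NEG, OP_COUNT };

/* Dense switch: the case values are 0..OP_COUNT-1 with no gaps, so a
   compiler may (but is not required to) lower it to an indirect jump
   through a table of code addresses. */
static int run_switch(const unsigned char *ops, size_t n, int acc)
{
    for (size_t i = 0; i < n; i++) {
        switch (ops[i]) {
        case OP_INC: acc += 1;   break;
        case OP_DEC: acc -= 1;   break;
        case OP_DBL: acc *= 2;   break;
        case OP_NEG: acc = -acc; break;
        default:                 break;  /* ignore unknown opcodes */
        }
    }
    return acc;
}

/* Hand-written equivalent using an explicit table of function pointers. */
static int op_inc(int a) { return a + 1; }
static int op_dec(int a) { return a - 1; }
static int op_dbl(int a) { return a * 2; }
static int op_neg(int a) { return -a; }

/* The table holds only addresses; it is loaded as data. */
static int (*const handlers[OP_COUNT])(int) = {
    op_inc, op_dec, op_dbl, op_neg
};

static int run_table(const unsigned char *ops, size_t n, int acc)
{
    for (size_t i = 0; i < n; i++) {
        if (ops[i] < OP_COUNT) {
            /* Indirect call: the target depends on the input data. */
            acc = handlers[ops[i]](acc);
        }
    }
    return acc;
}
```

Both loops compute the same result for the same opcode stream. The table of addresses is small and is data, while the size of the handler bodies is what determines how much decoded code the loop actually touches.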

wbmw