Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: wanna_bmw who wrote (53159), 8/31/2001 6:43:04 PM
From: fyodor
 
Wanna: adding more associativity does not make the cache more difficult to design. It only requires changes in logic that are straightforward to accomplish. Ask any designer on this forum. It becomes difficult to reduce access times while at the same time increasing associativity, but that is another matter entirely. What is more difficult is designing a larger cache, because the larger number of transistors can cause a higher defect rate per die, and thus lower yields considerably. Therefore, the designer must include extra redundancy, which increases the size of the die.

I think you guys are missing the forest for the trees on this issue.

Going with high associativity certainly has its advantages, but it also comes at a cost, mainly latency (and power consumption, IIRC).
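To make that concrete, here is a rough software sketch of a set-associative lookup. The geometry and the names are made up for illustration (not taken from the Athlon, the P4, or any other real part); the point is just that each extra way is another tag comparison that has to resolve before the cache can declare hit or miss, and that wider compare-and-select is where the latency and power cost of associativity comes from.

/* Rough sketch only: hypothetical geometry, not any real CPU's cache. */
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 64      /* assumed 64-byte lines        */
#define NUM_SETS   512     /* assumed 512 sets             */
#define ASSOC      4       /* assumed 4-way associativity  */

struct cache_line { bool valid; uint32_t tag; };
static struct cache_line cache[NUM_SETS][ASSOC];

bool lookup(uint32_t addr)
{
    uint32_t block = addr / LINE_BYTES;
    uint32_t set   = block % NUM_SETS;   /* index bits pick the set     */
    uint32_t tag   = block / NUM_SETS;   /* remaining bits form the tag */

    /* Raising ASSOC just widens this comparison: more tags have to be
     * checked (in parallel, in hardware) before hit/miss is known. */
    for (int way = 0; way < ASSOC; way++)
        if (cache[set][way].valid && cache[set][way].tag == tag)
            return true;
    return false;
}

The advantage side of the same trade-off is that more ways mean fewer conflict misses.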

It seems to me that both AMD and Intel are doing exactly what is in their best interest. Intel chooses to minimize latency, which will tend to give lower performance on "non-optimized" code (boy, that phrase is getting worn these days), but allows much higher performance on really well-optimized code. And since Intel is not only the de facto standard, but is also able to heavily influence benchmarks, it makes sense for Intel to go the low-latency route. AMD, on the other hand, needs its chips to perform well on code optimized for other architectures. This means that it cannot afford to sacrifice performance on non-optimized code, since virtually all benchmarks will be just that, non-optimized.

This same line of reasoning can be used to explain a whole host of other design choices by AMD and Intel. For example, AMD could never afford to leave out a barrel-shifter the way Intel did with the P4. While that deficiency certainly hurts P4 performance in actual applications (since shifting and rotating are standard optimizations, employed by virtually all compilers and programmers), all major benchmarks are quickly optimized to use alternate methods - which also seriously hurts Athlon performance, unless the benchmark uses separate code paths for P4 and Athlon processors, which I believe at least a few do.
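For anyone who hasn't watched compilers do this, here is a toy example of the kind of transformation I mean (the constant and the function names are invented for illustration): multiplying by a small constant is routinely turned into shifts and adds, which assumes a cheap shifter, while code tuned around a slow shifter can be rewritten to lean on additions instead.

/* Toy illustration of shift-based strength reduction vs. an add-only
 * alternative; the constant 10 and the names are made up. */
#include <stdint.h>

/* The usual compiler transform: x * 10 = (x << 3) + (x << 1).
 * Cheap when the CPU has a fast barrel shifter. */
uint32_t times10_shifts(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* An "alternate method" that avoids shifts and builds the product out
 * of additions (on x86, LEA sequences serve a similar purpose). */
uint32_t times10_adds(uint32_t x)
{
    uint32_t x2 = x + x;     /*  2x */
    uint32_t x4 = x2 + x2;   /*  4x */
    uint32_t x8 = x4 + x4;   /*  8x */
    return x8 + x2;          /* 10x */
}

Which version is faster depends entirely on how cheap the shifter is on the chip in question, which is exactly why the benchmark's choice of code path matters.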

-fyo