Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Ali Chen who wrote (225766), 2/11/2007 10:20:21 AM
From: fastpathguru
 
Your conclusion is not warranted. It simply means that, on the workloads you have in mind, latency to main memory matters far more than bandwidth. If a processor stalls because it needs 16 bits of data, and the next data hazard is not located sequentially in memory, supplying 64 or 128 bits at once does not improve performance at all, and can even have an adverse effect by thrashing already-cached data.

Cache lines are 64 B(ytes), not b(its). I don't think the wider L1-to-core bus will cause any cache thrashing.

And yes, it won't change the characteristics of memory-bound apps much, but for loops running out of L1, especially those using SSE registers, it should help quite a bit. Or to put it another way: the doubled FP unit would probably be starved without a complementary increase in bandwidth from L1.

fpg