Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: DDB_WO who wrote (227228), 3/2/2007 9:17:27 AM
From: Rink
 
re: One interesting comment was that Intel's current memory access latency is 100ns while AMD's is 70ns and he suggested that AMD will be at 55ns soon. I assume he was talking about K10. I'm now wondering where the improvement will come from. I wonder if this could be related to the new direct to L1 prefetch versus the older load to L2 prefetch.


DDB, this article seems to mention a lot of reasons for the reduced average main memory access latency, including the direct-to-L1-data prefetch that you mentioned and a generic prefetcher added to the memory controller; both improvements are mentioned on page 8 here: anandtech.com
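
For what it's worth, here is a rough sketch (my own, not from the article, and assuming a POSIX/gcc setup) of how latency figures like 100ns or 70ns are usually measured: dependent loads chasing a randomly permuted pointer chain, so neither the L1- nor L2-targeted prefetchers can guess the next address.

/* Toy pointer-chasing latency measurement (illustrative sketch only).
 * Each iteration is one load that depends on the previous one, and the
 * random permutation defeats stride prefetchers, so the average time per
 * step approximates main memory load-to-use latency. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (32 * 1024 * 1024 / sizeof(void *))  /* ~32 MB, well past L2 */
#define STEPS 10000000L

int main(void)
{
    void **chain = malloc(N * sizeof(void *));
    size_t *order = malloc(N * sizeof(size_t));
    if (!chain || !order) return 1;

    /* Build a random permutation and link the cells into one big cycle. */
    for (size_t i = 0; i < N; i++) order[i] = i;
    srand(1);
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t t = order[i]; order[i] = order[j]; order[j] = t;
    }
    for (size_t i = 0; i < N; i++)
        chain[order[i]] = &chain[order[(i + 1) % N]];

    /* Chase the chain: each step is a dependent, mostly-missing load. */
    struct timespec t0, t1;
    void **p = &chain[order[0]];
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < STEPS; i++)
        p = (void **)*p;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("avg latency: %.1f ns (p=%p)\n", ns / STEPS, (void *)p);
    free(chain); free(order);
    return 0;
}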

Regards,

Rink



To: DDB_WO who wrote (227228), 3/2/2007 10:40:38 AM
From: combjelly
 
"This is, what most people seem to miss: that IPC is not the max. sustainable throughput of a CPU's decoding/execution/retirement, but of the system as a whole."

Which is why those who have touted Conroe's 4 execution units are barking up the wrong tree. Sure, there are times when all of them are used, but it is rare.

"That's the reason, why OOO loads, improved prefetchers (+those in the NB), 32-way L3 cache, separate memory channels and so on are important in this equation"

Improvements to branch prediction are also helpful. One change that is often neglected is the addition of another, independent path to the L1 caches. This can help hide the latency to the L1, and it can be particularly useful for multi-threaded applications running out of L1.
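
A toy illustration of the point about execution units (my own sketch, not tied to any particular benchmark): a wide core only keeps several units busy when the instruction stream actually contains independent work. A single-accumulator sum is one long dependency chain; splitting it across four accumulators exposes work the out-of-order scheduler can overlap. Compile with optimizations (e.g. gcc -O2) and time the two loops.

#include <stdio.h>

#define N 100000000L

/* One accumulator: every add waits for the previous one, so per-cycle
 * throughput is capped by add latency no matter how wide the core is.
 * The small 1024-entry array stays in L1, so memory is not the limit here. */
static double sum_one_chain(const double *a)
{
    double s = 0.0;
    for (long i = 0; i < N; i++)
        s += a[i & 1023];
    return s;
}

/* Four independent accumulators: four chains the scheduler can interleave,
 * so more execution units actually get used each cycle. */
static double sum_four_chains(const double *a)
{
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    for (long i = 0; i < N; i += 4) {
        s0 += a[(i + 0) & 1023];
        s1 += a[(i + 1) & 1023];
        s2 += a[(i + 2) & 1023];
        s3 += a[(i + 3) & 1023];
    }
    return s0 + s1 + s2 + s3;
}

int main(void)
{
    static double a[1024];
    for (int i = 0; i < 1024; i++) a[i] = i * 0.5;
    printf("%f %f\n", sum_one_chain(a), sum_four_chains(a));
    return 0;
}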



To: DDB_WO who wrote (227228), 3/2/2007 1:53:03 PM
From: Ali Chen
 
"One interesting comment was that Intel's current memory access latency is 100ns while AMD's is 70ns and he suggested that AMD will be at 55ns soon. I assume he was talking about K10. I'm now wondering where the improvement will come from. I wonder if this could be related to the new direct to L1 prefetch versus the older load to L2 prefetch."

The attempt to reduce main memory latency is indeed a noble move. However, as I have said many times, the advantage of AMD's approach to memory handling is highly exaggerated. You need to look at the overall effect of the whole memory subsystem, including the art of hardware prefetch, the quality of software prefetch, cache miss rates, and FSB/memory penalties. If you look even at somewhat old data,
home.austin.rr.com
the bottom line is that AMD already has a 50-60% disadvantage in overall memory waste traffic even compared to the old Pentium D with an older chipset and slower memory (compare line (4) with lines (6) or (7)). That translates into a 25-30% loss in overall performance. The data are for SPEC2000, which has smaller data sets than the newer SPEC2006, so the gap should be even more pronounced in 2006. But even if the AMD statement (100ns vs. 70/55ns) is true, which I doubt (their baseline might be quite obsolete), their latency effort is in the right direction.
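
On the "quality of software prefetch" point, here is a hedged sketch (my own illustration, nothing from the cited data; the distance constant is made up and would need per-platform tuning): when prefetches are issued a sensible distance ahead, much of the main memory latency can be hidden, while prefetching too aggressively just generates the kind of wasted bus/memory traffic being compared above.

#include <stdio.h>
#include <stdlib.h>

#define PREFETCH_DISTANCE 16   /* illustrative value; real tuning is per-platform */

/* Sum a large array while prefetching a fixed distance ahead of the loads.
 * __builtin_prefetch is a GCC/Clang builtin (address, read/write, locality);
 * other compilers need their own intrinsic, e.g. _mm_prefetch on x86. */
static double sum_with_prefetch(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 0);
        s += a[i];
    }
    return s;
}

int main(void)
{
    size_t n = 1 << 24;                       /* ~128 MB of doubles */
    double *a = malloc(n * sizeof *a);
    if (!a) return 1;
    for (size_t i = 0; i < n; i++) a[i] = 1.0;
    printf("%f\n", sum_with_prefetch(a, n));  /* compare timing against a plain loop */
    free(a);
    return 0;
}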

- Ali