Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Ali Chen who wrote (225597) 2/8/2007 2:29:38 PM
From: combjelly
 
"The performance lag is in handling of external memory traffic"

I think the biggest lag is in bandwidth to the caches. Core2 has twice the bandwidth to the caches that the K8 has. This is addressed in Barcelona, which doubles the datapaths to the caches. So for certain workloads, Barcelona could have an IPC advantage over Core2.
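As a rough illustration of what "bandwidth to the caches" means in practice, a minimal streaming-read microbenchmark in C might look like the sketch below. The 32 KB working set, the repeat count, and the clock()-based timing are illustrative assumptions, not figures for K8, Core2, or Barcelona; sized for L1, L2, or main memory, the reported number falls off at each level.

```c
/* Minimal sketch: estimate sustained read bandwidth from a cache-resident
 * buffer.  The 32 KB working set (assumed to fit in L1) and the repeat
 * count are illustrative choices, not figures for any particular CPU. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define WORKING_SET (32 * 1024)            /* bytes; assumed L1-resident */
#define REPEATS     200000

int main(void)
{
    size_t n = WORKING_SET / sizeof(long);
    long *buf = malloc(n * sizeof(long));
    volatile long sink = 0;

    for (size_t i = 0; i < n; i++)
        buf[i] = (long)i;

    clock_t start = clock();
    for (int r = 0; r < REPEATS; r++) {
        long acc = 0;
        for (size_t i = 0; i < n; i++)
            acc += buf[i];                 /* sequential, cache-hot reads */
        sink += acc;                       /* keep the pass from being optimized away */
    }
    clock_t end = clock();

    double secs  = (double)(end - start) / CLOCKS_PER_SEC;
    double bytes = (double)WORKING_SET * REPEATS;
    printf("approx read bandwidth: %.2f GB/s (check=%ld)\n",
           bytes / secs / 1e9, (long)sink);
    free(buf);
    return 0;
}
```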



To: Ali Chen who wrote (225597) 2/8/2007 2:29:58 PM
From: fastpathguru
 
"The list you compiled looks very impressive indeed. Unfortunately, I don't see much in it that would address optimisation of outside memory traffic, the area where AMD is currently quite behind Intel. From what I am gathering, current AMD Opterons are about equal in terms of _raw_ IPC performance to Core 2. The performance lag is in handling of external memory traffic, the issue that was always overlooked historically, misunderstood and under-appreciated. Until AMD manages to manufacture caches as big as Intel does (and same low latency), no nitpicking in _raw_IPC_ would regain performance parity, IMO."

Not sure if you're saying what you sound like you're saying... AMD has the superior interface to external memory, via the integrated memory controller. Intel has compensated for its deficiency there with its large caches. Now AMD is also going to have larger (even if not as humongo as Intel's) caches. That list also includes many memory-hierarchy-related improvements, such as increased internal bandwidth, among others.
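To make the cache-versus-integrated-memory-controller trade-off concrete, the usual tool is a dependent pointer chase: each load has to wait for the previous one, so it exposes the load-to-use latency of whatever level the working set lands in. Below is a minimal C sketch along those lines; the 64 MB default, the single-cycle shuffle, and the clock()-based timing are illustrative assumptions, not a reference benchmark for either vendor's parts.

```c
/* Minimal pointer-chase sketch: average time per dependent load for a given
 * working-set size.  The 64 MB default, the single-cycle shuffle, and the
 * clock()-based timing are illustrative assumptions, not a tuned benchmark. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char **argv)
{
    size_t bytes = (argc > 1) ? (size_t)atol(argv[1]) : 64UL * 1024 * 1024;
    size_t n = bytes / sizeof(size_t);
    size_t *next = malloc(n * sizeof(size_t));
    if (!next || n < 2)
        return 1;

    /* Sattolo-style shuffle: builds one big cycle so the chase visits every
     * slot, and the random order defeats the hardware prefetcher. */
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    srand(1);
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    size_t hops = 20 * 1000 * 1000, p = 0;
    clock_t start = clock();
    for (size_t h = 0; h < hops; h++)
        p = next[p];                       /* each load depends on the previous one */
    clock_t end = clock();

    double ns = 1e9 * (double)(end - start) / CLOCKS_PER_SEC / (double)hops;
    printf("working set %zu bytes: ~%.1f ns per dependent load (p=%zu)\n",
           bytes, ns, p);
    free(next);
    return 0;
}
```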

fpg



To: Ali Chen who wrote (225597) 2/8/2007 4:39:39 PM
From: jspeed
 
"Until AMD manages to manufacture caches as big as Intel does (and same low latency), no nitpicking in _raw_IPC_ would regain performance parity, IMO."

This doesn't make much sense. IPC is the average number of instructions executed per clock, and executing instructions includes fetching and storing the necessary data.

So cache size is one aspect of the ability to get and store data (effective memory bandwidth): the bigger the cache, the fewer cache misses you tend to have. Other aspects are cache latency, bus widths, bus speeds, cache hierarchy, and so on.

So most of AMD's changes were geared toward increasing memory bandwidth in one way or another. In fact, one of the changes was to include more cache: Barcelona has 10.5M of cache for starters and will be expandable, while Woodcrest has 16M total, so there's not really that much of a difference between the two.
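The point that IPC already folds in the cost of getting the data is just the textbook effective-CPI relation: effective CPI = base CPI + memory references per instruction x miss rate x miss penalty, and delivered IPC is its reciprocal. A minimal sketch of that arithmetic is below; the base CPI, reference rate, miss rates, and 200-cycle penalty are invented round numbers for illustration, not measurements of K8, Barcelona, or Woodcrest.

```c
/* Textbook effective-CPI arithmetic: a bigger cache lowers the miss rate,
 * which raises delivered IPC even though the "raw" core is unchanged.
 * Every number below is an invented round figure, purely for illustration. */
#include <stdio.h>

static double effective_ipc(double base_cpi, double refs_per_instr,
                            double miss_rate, double miss_penalty_cycles)
{
    double cpi = base_cpi + refs_per_instr * miss_rate * miss_penalty_cycles;
    return 1.0 / cpi;
}

int main(void)
{
    double base_cpi = 0.5;   /* assumed CPI with perfect caches            */
    double refs     = 0.35;  /* assumed memory references per instruction  */
    double penalty  = 200.0; /* assumed main-memory miss penalty in cycles */

    /* Two hypothetical cache configurations that differ only in miss rate. */
    printf("smaller cache (2%% misses): IPC = %.2f\n",
           effective_ipc(base_cpi, refs, 0.02, penalty));
    printf("larger cache  (1%% misses): IPC = %.2f\n",
           effective_ipc(base_cpi, refs, 0.01, penalty));
    return 0;
}
```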



To: Ali Chen who wrote (225597) 2/10/2007 10:02:39 AM
From: Dan3
 
Re: Unfortunately, I don't see much in it that would address optimisation of outside memory traffic, the area where AMD is currently quite behind Intel.

There is a giant weakness in AMD's design right now, and that's the core's connection to cache. That's (FINALLY!) being fixed through a 100% increase in bandwidth. The x86 architecture compensates for its lack of registers (relative to MIPS, IA64, etc.) through heavy use of L1. That core-to-cache choke point is so overwhelming that the connection to outside memory is rarely relevant - chopping that connection's speed in half (populating only one memory channel) has only about a 3% to 5% effect on overall performance.
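That 3% to 5% figure is easy to reproduce with a simple weighted model: if only a small slice of total execution time is spent stalled on DRAM, slowing that slice down moves the total by the slice times the slowdown. The sketch below runs that arithmetic; both input numbers are assumptions chosen for illustration, not measurements of any Opteron configuration.

```c
/* Back-of-envelope model for the 3%-5% claim: if only a small slice of
 * total execution time is spent stalled on DRAM, slowing DRAM traffic
 * down moves the total by that slice times the slowdown.  Both inputs
 * are assumed round numbers, not measurements. */
#include <stdio.h>

int main(void)
{
    double dram_stall_share = 0.08; /* assumed fraction of cycles stalled on DRAM */
    double one_channel_cost = 1.5;  /* assumed growth of those stalls with one channel */

    double overall_slowdown = dram_stall_share * (one_channel_cost - 1.0);
    printf("estimated overall slowdown with one channel: %.1f%%\n",
           overall_slowdown * 100.0);
    /* With these assumptions the answer is about 4%, i.e. in the 3%-5%
     * range quoted above; it scales linearly with either input. */
    return 0;
}
```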