Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Joe NYC who wrote (196997)
Date: 5/16/2006 1:29:54 AM
From: dougSF30
 
but the contribution will be CPU cycles, and in single digits, as far as CPU cycles are concerned.

That is interesting about helping loads from cache, but reordering can move a load MANY instructions back, as I understand it, which could be 10-20 cycles or so, let's say.

Also, K8's latency to main memory advantage is projected to shrink:

With the numbers available to us now, we have reason to believe that the Athlon 64 X2's latency advantage will shrink to only 15 to 20%. For comparison, the memory subsystem of the Pentium 4 was almost twice as slow as the Athlon 64 (80-90 ns versus 45-50 ns).

So that would put Core2 at, say, 60ns vs 50ns for K8, for main memory latency.

Converting to cycles at 2.6GHz: 1 cycle = 1/2.6 ns, or about .38 ns, so roughly:

Core2: 158 cycles
K8: 132 cycles

If load reordering can save 10-20 cycles, that's a large portion of the difference between the two systems even during cache misses.
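
For anyone who wants to check the arithmetic, here is a rough sketch of the conversion in Python. The 60 ns and 50 ns figures are my guesses from above, not measurements, and the function and variable names are just made up for illustration. Using the unrounded cycle time (1/2.6 ns) gives about 156 and 130 cycles, a touch under the figures you get from rounding to .38 ns, but the gap is 26 cycles either way.

```python
# Back-of-envelope conversion of memory latency (ns) to CPU cycles.
# All inputs are assumptions taken from the estimates in this post.

def ns_to_cycles(latency_ns, clock_ghz):
    """Cycles = latency in ns * clock in GHz (1 GHz = 1 cycle per ns)."""
    return latency_ns * clock_ghz

CLOCK_GHZ = 2.6          # assumed clock speed for both chips
CORE2_LATENCY_NS = 60    # guessed Core2 main memory latency
K8_LATENCY_NS = 50       # guessed K8 main memory latency

core2_cycles = ns_to_cycles(CORE2_LATENCY_NS, CLOCK_GHZ)
k8_cycles = ns_to_cycles(K8_LATENCY_NS, CLOCK_GHZ)
gap_cycles = core2_cycles - k8_cycles

# Share of the gap covered if load reordering saves 10 or 20 cycles.
share_low = 10 / gap_cycles
share_high = 20 / gap_cycles

print(f"Core2: {core2_cycles:.0f} cycles, K8: {k8_cycles:.0f} cycles")
print(f"Gap: {gap_cycles:.0f} cycles")
print(f"Reordering win covers {share_low:.0%} to {share_high:.0%} of the gap")
```

So a 10-20 cycle reordering win would cover somewhere between a bit over a third and about three quarters of the gap, if those latency guesses hold.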

Now, I don't know what the average reordering win is, and you may be right anyhow, that the biggest contribution comes from cache hit reordering.