To: Tenchusatsu who wrote (69561) 1/30/2002 5:53:27 PM
From: Joe NYC

Tenchusatsu,

So a 2-way Clawhammer system will probably forego ccNUMA optimization. Like I said before, latency will be impacted, but bandwidth will scale nicely, and in servers bandwidth is usually more important than latency.

It will be interesting to see the latency comparison between local and remote (1 hop away) access. It should not be much different from traditional CPUs with an off-chip northbridge/memory controller. For a remote access 1 hop away, the other processor serves as the memory controller for the processor making the request, so the latency should be about the same as what we are used to.

In the case of a 4-way system, as shown on the Hammer slide I posted earlier today, the HT switch is a 4-way switch: one port goes to the CPU's local memory controller, a second to communication with the outside world, and the third and fourth to connections to the other processors. In this configuration there are 2 processors 1 hop away and 1 processor 2 hops away. So if memory access is not optimized in any way, the average latency should be composed of:

25% from the local memory controller (0 hops)
50% from memory controllers on CPUs 1 hop away
25% from the memory controller on a CPU 2 hops away

Now let me do a WAG on latency. Suppose latency is expressed in a unit corresponding to a traditional memory access through an off-chip northbridge - 1 unit. If local memory access latency is 0.5 of this unit, 1-hop latency is 1.0, and 2-hop latency is 1.5, the average latency would be:

25% x 0.5 U = 0.125 U
50% x 1.0 U = 0.500 U
25% x 1.5 U = 0.375 U
---------------------
Total = 1.000 U

Or latency no worse than a shared bus with an off-chip memory controller. Of course, the theoretical total bandwidth of this system would be 4x the bandwidth of a single node.

Joe
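
P.S. For anyone who wants to play with the numbers, here is a quick back-of-the-envelope sketch of the same weighted-average calculation. The hop fractions and per-hop latencies are just my WAG assumptions from above, not measured figures.

/* Average memory latency for the 4-way Hammer topology described above,
 * assuming accesses are spread evenly across all four nodes (no NUMA-aware
 * placement).  Latencies are in "units" where 1.0 unit = a traditional
 * off-chip northbridge access.  All values are assumptions, not data. */
#include <stdio.h>

int main(void)
{
    const double fraction[3] = { 0.25, 0.50, 0.25 }; /* share of accesses at 0, 1, 2 hops */
    const double latency[3]  = { 0.5, 1.0, 1.5 };    /* assumed latency in units per hop count */

    double average = 0.0;
    for (int hops = 0; hops < 3; hops++)
        average += fraction[hops] * latency[hops];

    printf("Average latency: %.3f units\n", average); /* prints 1.000 */
    return 0;
}

Changing the placement assumption (say, an OS that keeps most pages local) just means changing the fraction[] entries, which is the whole point of the ccNUMA optimization argument.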