Politics : Formerly About Advanced Micro Devices


To: Dan3 who wrote (83233), 12/16/1999 11:02:00 AM
From: Ali Chen
 
Dan, <If I am mistaken, I'd be interested in hearing why.>

You are not mistaken, Dan. There is no need to
argue with that arrogant but fairly ignorant
youth. He is doing what he was told to do,
without any afterthought of his own.

I have tried on several occasions to tell him
that raw bandwidth is not the bottleneck in
most applications. "Streaming large blocks of
data" is the dying Intel mantra used to justify
their push for a Unified Memory Architecture,
with AGP as the primary "streaming device"
pumping only texture maps (which need no
processing) and the "Intel inside" CPU as the
primary graphics processor.

We all know that the demand for 3D has shifted
this paradigm from UMA to hardware accelerators
with continuously increasing LOCAL memory
(since only local memory can provide the
bandwidth needed for modern video processing).

The limited relevance of raw bandwidth has been
demonstrated in practice during several
transitions in system memory technology: from
FP to EDO, then from EDO to SDRAM, and now we
are witnessing another one, from SDRAM to
RDRAM. In every case the raw bandwidth at least
doubled, yet the resulting system performance
gain was only in the range of 3-10%.
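
To put rough numbers on this, here is a
back-of-envelope sketch (the miss rate, latency,
and transfer times are assumptions chosen only
for illustration, not measurements of any real
system):

/* Back-of-envelope model: effect of doubling raw memory bandwidth
 * on overall CPU performance.  All numbers are assumptions chosen
 * for illustration, not measurements of any real system. */
#include <stdio.h>

int main(void)
{
    double cpi_core  = 1.0;   /* cycles per instruction with no misses  */
    double miss_rate = 0.01;  /* cache misses per instruction (assumed) */
    double latency   = 80.0;  /* fixed DRAM access latency, CPU cycles  */
    double xfer_old  = 30.0;  /* 64-byte line transfer at old bandwidth */
    double xfer_new  = 15.0;  /* same transfer at doubled bandwidth     */

    double cpi_old = cpi_core + miss_rate * (latency + xfer_old);
    double cpi_new = cpi_core + miss_rate * (latency + xfer_new);

    printf("speedup from doubling bandwidth: %.1f%%\n",
           (cpi_old / cpi_new - 1.0) * 100.0);   /* roughly 7.7% here */
    return 0;
}

Because most of a miss is fixed latency, and most
instructions never miss at all, halving only the
transfer term moves overall performance by just a
few percent.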

What's more, even today an x86 CPU has no
internal means to fully utilize the existing
memory bandwidth of regular 64-bit-wide SDRAM.
Why is this important? Because if the "streamed
data" never needed to be processed by a CPU,
the whole server business could be handled by a
simple hardware switch-multiplexer!
No need for Itanic or Athlon/Sledgehammer!
As we all know, this is not the case at all:
at a minimum, a CRC needs to be calculated and
checked on every data packet, which requires
full-blown CPU intervention, to say nothing of
more intelligent work like routing or content
processing. Maybe Intel has some ideas about
how to separate raw "streaming" from
intelligent content processing in hardware,
but I am not sure that fits well with current
software layers and trends.
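
To make the CRC point concrete, here is a minimal
bitwise CRC-32 sketch (a textbook routine offered
purely as an illustration, not code from any
particular NIC driver or protocol stack). Every
byte of a "streamed" packet still has to pass
through the CPU before the packet can be checked
or forwarded:

/* Minimal bitwise CRC-32 (IEEE polynomial, reflected form). */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc = ~crc;                       /* standard initial inversion */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];                /* the CPU touches every byte */
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                      /* standard final inversion */
}

int main(void)
{
    const uint8_t pkt[] = "123456789";   /* standard CRC-32 test vector */
    printf("crc32 = %08x\n", (unsigned)crc32_update(0, pkt, 9));  /* 0xcbf43926 */
    return 0;
}

Dedicated CRC offload engines exist precisely to
avoid this per-byte CPU work, which is the kind
of hardware separation of "streaming" from
processing the paragraph above is questioning.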

Therefore, all his pomposity and his claims
about holy chipset designability are BS.

Regards,
- Ali



To: Dan3 who wrote (83233), 12/16/1999 1:19:00 PM
From: Tenchusatsu
 
Dan, <But their demands on memory aren't for streaming huge blocks of memory, they are demands for many smaller bursts from random locations.>

That's where huge processor caches come in, at least for Xeon servers, and perhaps for Itanium as well (4 MB of off-chip L3 cache in the Merced module). As for EV7, well, that's why they integrated the memory controllers right onto the processor core. That's a sure-fire way to reduce the latency of main memory accesses.
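
To see why both tricks matter, the textbook average-memory-access-time formula (AMAT = hit time + miss rate * miss penalty) is enough; the sketch below uses assumed figures purely for illustration, not numbers from any actual Xeon, Merced, or EV7 system:

/* AMAT = hit_time + miss_rate * miss_penalty.  A larger off-chip L3
 * cuts the miss rate; an on-die memory controller (as on EV7) cuts
 * the miss penalty.  All figures are assumptions for illustration. */
#include <stdio.h>

int main(void)
{
    double hit_ns = 5.0;                        /* cache hit time (assumed) */
    double baseline = hit_ns + 0.040 * 180.0;   /* small cache, external controller */
    double big_l3   = hit_ns + 0.015 * 180.0;   /* larger L3: fewer misses */
    double on_die   = hit_ns + 0.015 * 120.0;   /* plus on-die controller: cheaper misses */

    printf("baseline               : %4.1f ns\n", baseline);
    printf("large L3               : %4.1f ns\n", big_l3);
    printf("large L3 + on-die ctrl : %4.1f ns\n", on_die);
    return 0;
}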

Especially in servers, the biggest cause of latency is unsustainable throughput. All this nitpicking over the additional latency of RDRAM might mean a few percentage points of performance in desktop systems, but it means absolutely nothing in servers.
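
One way to see this is a simple M/M/1 queueing sketch (an illustration with assumed numbers, not a model of any specific memory controller): once the memory system is pushed close to the throughput it can actually sustain, queueing delay dwarfs the few tens of nanoseconds that separate one DRAM technology from another.

/* Effective latency under load, M/M/1 approximation:
 *   effective latency = unloaded latency / (1 - utilization)
 * Numbers are assumptions chosen for illustration. */
#include <stdio.h>

int main(void)
{
    double unloaded_ns = 60.0;                  /* unloaded access time (assumed) */
    double util[] = { 0.10, 0.50, 0.80, 0.95 };

    for (int i = 0; i < 4; i++)
        printf("utilization %2.0f%% -> effective latency %4.0f ns\n",
               util[i] * 100.0, unloaded_ns / (1.0 - util[i]));
    return 0;
}

At 95% utilization the loaded latency is twenty times the unloaded figure, which is why sustainable throughput dominates once a server is actually busy.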

<But I'm arguing that for almost any server application, it is DDR that has better MHZ to MHZ performance due to lower latency.>

No, in fact the performance differences between DDR and RDRAM in servers are inconclusive. It's not clear to me that DDR can use its bandwidth efficiently enough in a server environment to match the potential performance of RDRAM. Besides, the main reason DDR is being pushed over RDRAM is not performance, but cost. That's another debate, however.

But like I said before, the minuscule savings in latency that you get with DDR over RDRAM mean absolutely nothing in servers. Don't take my word for it, though. Take what MPR says about HotRail's upcoming 8-way Athlon chipset:

Perhaps the most significant problem with the HotRail architecture is the extra latency added by the relatively long path each transaction must take through the chipset. ... [However,] the company points out -- correctly, we believe -- that its advantage in sustained throughput for the whole system is much more important for most server applications.

So in short, servers care more about bandwidth than latency. If you can't sustain the bandwidth, then a minuscule latency advantage of DDR over RDRAM isn't going to mean squat. This is different from desktops, where sustained bandwidth is less important, meaning that latency becomes a bigger factor in performance.

Back to the original subject regarding Alpha EV7. Yes, 16 RDRAM channels, four per processor, does seem like an insane amount of memory bandwidth. But four RDRAM channels are more easily integrated onto the processor core than four DDR channels would be. And that integration will naturally lead to lower latency. Therefore, EV7-based servers will have the advantages of high bandwidth, low latency, and very sustainable throughput. (In fact, I feel EV7 can seriously challenge Merced/Itanium in terms of performance.)

Of course, EV7-based servers with RDRAM will naturally cost more than servers based on an equivalent amount of DDR SDRAM. I guess that's the price paid for the performance. If they decide to switch to four integrated DDR controllers, I'd sure like to know, since there are some major trade-offs to consider here.

Tenchusatsu