Technology Stocks : Advanced Micro Devices - Moderated (AMD)

To: fyodor_ who wrote (79821)5/9/2002 9:01:02 PM
From: pgerassiRead Replies (1) of 275872
 
Dear Fyo:

You forgot that there is also an in-between solution to expansion: each memory module has three ports, one toward the root and a pair toward the leaves. Thus a single GPU HT link connects to one memory module, which connects to two more, which connect to four more, and so on. If one chip has 1Gb (128MB), which is beginning to be produced, that 128MB is available with one command's latency (5 bytes out, 5 bytes back carrying 64 bytes of data, for a total of 74 bytes per memory read). A three-chip solution adds only one more 10-byte hop, seven chips add two, etc. In general, 2^N-1 chips have at most N*10+64 bytes of latency per read at the command rate (though there may be some store-and-forward delay per level).
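As a sketch of that byte-count formula (the helper name `read_latency_bytes` is hypothetical, and the constants come straight from the numbers above: 5 command bytes out, 5 response bytes back, 64 data bytes, 10 extra bytes per tree level):

```python
import math

def read_latency_bytes(num_dies: int) -> int:
    """Bytes on the wire for one 64-byte read through a binary tree
    of memory dies: 5+5 command/response bytes, 64 data bytes, plus
    10 bytes for each additional tree level the request traverses."""
    # A tree of N levels holds 2^N - 1 dies, so N = ceil(log2(dies + 1)).
    levels = math.ceil(math.log2(num_dies + 1))
    return 10 * levels + 64

print(read_latency_bytes(1))  # single die: 74 bytes, as in the text
print(read_latency_bytes(3))  # one extra level: 84 bytes
print(read_latency_bytes(7))  # two extra levels: 94 bytes
```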

At 16 bits wide, bi-directional, at an 800MHz DDR clock (3.2GB/sec), using 1Gb DRAM die, 128MB of memory would have a link latency of 23ns, plus the underlying DRAM latency of course. 384MB would add under 4ns more, and each further level adds under 4ns: 896MB has 29ns, 1920MB has 33ns, and 3968MB has 37ns. RDRAM is much higher than this. Using HTT addressing, the link could handle 255 dies with an overall latency of 81ns (adding 35ns for the DRAM) and 31+GB of memory. That is far less latency than RDRAM, with a much larger memory capacity (RDRAM maxes out at 4GB using 1Gb DRAMs).
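Those capacity/latency figures can be reproduced from the same model (a sketch only; `tree_stats` is a hypothetical helper, and 3.2GB/s and 128MB-per-die are the assumptions stated above):

```python
LINK_GB_PER_S = 3.2   # 16-bit link at 800 MHz DDR = 3.2 GB/s
DIE_MB = 128          # one 1Gb DRAM die

def tree_stats(levels: int):
    """Capacity (MB) and wire latency (ns) for a binary tree of dies."""
    dies = 2 ** levels - 1
    wire_bytes = 10 * levels + 64
    # bytes divided by GB/s comes out directly in nanoseconds
    latency_ns = wire_bytes / LINK_GB_PER_S
    return dies * DIE_MB, latency_ns

for n in range(1, 9):
    mb, ns = tree_stats(n)
    print(f"{n} levels: {mb:5d} MB, {ns:5.2f} ns link latency")
```

One level gives 128MB at about 23ns and three levels give 896MB at about 29ns, matching the figures in the text; eight levels reach 255 dies and roughly 32GB.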

A wider HTT link just improves the latency and peak bandwidth at a cost in pins. A narrower link has higher latency, fewer pins, and less bandwidth, but may run at higher speeds. In either case, lower memory addresses would have lower latency than higher ones (or vice versa), which allows for optimizations. It is also easily adapted to current MB designs with three slots: the first is the prime module and the other two are leaf modules.
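The width-versus-latency trade-off can be made concrete with the same byte-count model (a sketch under the assumptions above; `link_latency_ns` is a hypothetical name, and the model ignores any per-level store-and-forward delay):

```python
def link_latency_ns(width_bits: int, levels: int,
                    clock_mhz: int = 800, ddr: bool = True) -> float:
    """Wire latency for one 64-byte read at a given HTT link width,
    modeled simply as bytes-on-the-wire divided by link bandwidth."""
    transfers_per_s = clock_mhz * 1e6 * (2 if ddr else 1)
    gb_per_s = transfers_per_s * width_bits / 8 / 1e9
    return (10 * levels + 64) / gb_per_s

# Halving the width roughly doubles the wire latency for one level:
print(link_latency_ns(16, 1))  # 16-bit link
print(link_latency_ns(8, 1))   # 8-bit link
```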

Pete