To: Charles R who wrote (63 ) 9/18/1999 7:51:00 PM From: Dan3 Read Replies (1) | Respond to of 271
Re: something that breaks down the latency of RDRAM/PC133/DDR chips... I hope Bilow can step in here, but I will do my best to begin - I hope I don't make too many mistakes!

Look at the first page of: usa.samsungsemi.com

The timing for the various stages of a read cycle is indicated along the bottom of the chart. Note that the first memory transfer is completed in 30 ns (4 x 7.5 ns for PC133 at CAS 3, and 3 x 10 ns for PC100 at CAS 2). PC133 can run at CAS 2, but the general rule is that current 133 MHz parts are spec'd for CAS 3. Note that VC DRAM parts are specified for CAS 2 at 133 MHz, which lowers this time to 22.5 ns.

Now look at the page numbered 21 (23 in Acrobat) of: usa.samsungsemi.com

Note that valid data (DQA) first appears 16 clocks (40 ns) after the completion of an activate command. Now look at the first page of this data sheet and observe that the parts are bin-split into 40, 45, and 53 ns grades - only the 40 ns parts (the equivalent of a Pentium III 600 or an Athlon 650) can be run this fast. The SDRAM numbers are good for any PC100 or PC133 part, not just the very top level of the bin-split, so add in 5 ns (at least!) for real-world parts.

Now, about that activate command. Originally, Rambus was to be installed in a "wind tunnel" and run active at all times, but that created a lot of heat, and it appears (we won't know until somebody actually ships something) that instead the Rambus devices will be running STBY, so you have to add 4 clocks (10 ns) to the existing 45 ns typical figure - bringing the total to 55 ns before the component has valid data.

What percentage of the time will the ACT delay be added in? We won't know until some real-world applications show up, but if the memory is active all the time the RIMM will overheat, so some significant portion of memory accesses will have to add the extra 10 ns to the initial 45 ns latency period.
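The bookkeeping above can be sketched in a few lines. This is a minimal sketch of the arithmetic only - the clock counts, the 5 ns bin-split margin, and the 10 ns standby penalty are the figures quoted in this post, not numbers I have verified against a datasheet:

```python
# Clock periods (ns) for the buses discussed above.
PC133_PERIOD_NS = 7.5    # 133 MHz SDRAM bus
PC100_PERIOD_NS = 10.0   # 100 MHz SDRAM bus
RDRAM_PERIOD_NS = 2.5    # 400 MHz Rambus clock

# SDRAM: time to first data word = clocks-to-data x clock period.
pc133_cas3 = 4 * PC133_PERIOD_NS        # 30.0 ns
pc100_cas2 = 3 * PC100_PERIOD_NS        # 30.0 ns
vcdram_cas2_133 = 3 * PC133_PERIOD_NS   # 22.5 ns (VC DRAM at CAS 2, 133 MHz)

# RDRAM: 16 clocks from activate to valid data on the fastest (40 ns) bin,
# plus ~5 ns for real-world (slower-bin) parts, plus a 4-clock (10 ns)
# penalty whenever the device has to come out of standby first.
rdram_best_bin = 16 * RDRAM_PERIOD_NS                      # 40.0 ns
rdram_typical = rdram_best_bin + 5.0                       # 45.0 ns
rdram_from_standby = rdram_typical + 4 * RDRAM_PERIOD_NS   # 55.0 ns
```

So even before the standby question, a typical RDRAM part is around 45 ns to first data versus 30 ns for commodity SDRAM, and 55 ns when the ACT delay applies.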
Rambus has a fast, expensive bus structure that compensates for most or all of that disadvantage, leaving the time to fill a cache line about the same for Rambus and PC133 - but also leaving Rambus at a tremendous cost and density disadvantage. If the ensuing data requests are at continuous increments of the starting address, Rambus has a big advantage in that it can stream data continuously at double the speed of PC100 once the latency delay has passed. And PCxxx is limited to a memory page per burst, whereas Rambus can (I think) stream the whole chip's contents. But, unlike video card usage, PC memory usage is rarely such that this advantage can be exploited.

This isn't my field, but I've done my best - I would be grateful if someone could review and correct what I have written. I think this is pretty much the story, though.

Dan
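To see why the cache-line fill times come out about the same, here is a toy model under stated assumptions: a 32-byte cache line, PC100's 0.8 GB/s peak (64-bit bus at 100 MHz), Rambus at roughly double that, and the first-word latencies from the post. The line size and the "double the speed" figure are illustrative assumptions, not measurements:

```python
LINE_BYTES = 32  # assumed cache-line size for illustration

def fill_time_ns(first_word_latency_ns, bandwidth_gb_per_s):
    """Time for the first word to arrive, plus streaming the rest of the
    line at the bus's peak rate (1 GB/s moves 1 byte per ns)."""
    return first_word_latency_ns + LINE_BYTES / bandwidth_gb_per_s

pc100_fill = fill_time_ns(30.0, 0.8)   # 30 ns latency + 40 ns streaming
rdram_fill = fill_time_ns(45.0, 1.6)   # 45 ns latency + 20 ns streaming
```

The higher Rambus latency and its higher streaming rate roughly cancel over one cache line (about 70 ns versus about 65 ns in this toy model); the bandwidth edge only pays off on long sequential streams, which is the point made above about video-card versus PC usage.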