Warning: Technical discussion to follow…
Tenchusatsu,
I've been reading up on the Rambus technology and talking to a friend who designs ASICs, and believe I'm finally getting a handle on the technology (both standard SDRAM and RDRAM). I'd like to advance the following statements and would appreciate it if you could help clear up any final misunderstandings I may have.
1) Rambus appears to use a regular 100 MHz SDRAM core internally. A Rambus channel (from the chipset to the RIMM) is 2 bytes wide; by transferring 2 bytes every 1.25 ns, you get 16 bytes every 10 ns. The data path inside the RDRAM devices is 16 bytes wide, so a fresh 16-byte block must be available every 10 ns - which is 100 MHz speed (and I think I've seen statements before that they use a regular 100 MHz SDRAM core). The implication is that when 133 MHz SDRAM is available, Rambus memory can immediately rev up from 1.6 gigabytes/second to 1.6 * 1.333 = 2.13 gigabytes/second per Rambus channel (assuming the transmission-line effects are still covered, chipsets are made available, et cetera - how much other stuff do you think would have to change?). This would also imply that the folks at Rambus would be very interested in seeing continued improvements in the speed of the basic SDRAM technology. Finally, it also points out that the die-size increase in Rambus memory comes from the control circuitry added on top of the standard SDRAM circuitry - circuitry that must capture and store the 16 bytes to be sent out 2 at a time, and capture and store the Row and Column data (more about this below).
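Just to check my arithmetic, here it is in Python (the only inputs are the figures above - 2 bytes per 1.25 ns transfer, and a 133/100 scaling factor):

    # Back-of-the-envelope check of the Rambus channel bandwidth figures.
    bytes_per_transfer = 2       # channel is 2 bytes wide
    transfer_ns = 1.25           # one transfer every 1.25 ns

    bw_gb_per_s = bytes_per_transfer / transfer_ns   # bytes/ns == GB/s
    print(bw_gb_per_s)           # 1.6 GB/s per channel with a 100 MHz core

    # If the core steps up to 133 MHz, everything scales by 133/100:
    print(bw_gb_per_s * 133 / 100)   # ~2.13 GB/s per channel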
2) I think I understand your statement about stubs and DDR DRAM, although what I'm about to say is probably a huge oversimplification. In standard SDRAM (and the DDR I spec works the same way, I guess), when the signal from the chipset hits the first DIMM socket, it splits in two directions - one up into the DIMM in that socket, and one on to the next DIMM socket. Based on the impedance in each direction, the voltage drops along the signal, resulting in signal degradation. Also, just as a wave reflects off the edge of a pool, the signal going up into the DIMM gets reflected back. A signal travels at roughly 1 foot/nanosecond (in free space; it's somewhat slower on an actual PCB trace), so if the stub up into the DIMM is 2 inches (a guess), you'd get a reflection arriving back at the connector about 1/3 of a nanosecond later (4 inches of total travel / 12 inches per nanosecond).

That reflected wave then splits when it gets back to the connector, part going back to the chipset and part going on to the next connector. At the next connector it splits again, and it also meets a similar wave coming back out of that DIMM (ignoring the distance traveled to get to the second DIMM), reinforcing the first wave as they move on to the 3rd DIMM. In essence, in the worst case you'd have a wave growing at each DIMM, following 1/3 of a nanosecond behind the original changing signal. At best, the signal gets "fuzzed up" as these reflections keep bouncing back out of the DIMMs. In 100 MHz SDRAM this isn't that big a deal, because the signal has multiple (2-4?) nanoseconds to settle. But in a Rambus system, where everything has to be done in 1.25 ns, a third of a nanosecond is over a quarter of the total time - enough to keep the signals from settling on time.

Rambus gets around this by never letting the signal split: when it reaches the first RIMM, it goes only up into the RIMM, travels very close to each of the ICs' pins (to minimize reflections), and exits the RIMM through another pin on to the next RIMM. No branching, so no reflections into the system. This is also why Rambus systems will require continuity modules to be inserted in the empty sockets, whereas SDRAM systems can have open DIMM sockets. Also, it appears that there is a "continuity-checker" loop between the chipset and all RIMM sockets, which tells me the chipset can send a message back to the processor to say "a socket is empty," which would then halt system operation (does this mean the BIOS has to change to include this as part of the start-up testing?).
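To put numbers on that reflection timing (the 2-inch stub and the 1 ft/ns propagation speed are the guesses from above):

    # Rough stub-reflection timing estimate.
    stub_in = 2.0            # guessed stub length up into the DIMM
    speed_in_per_ns = 12.0   # ~1 ft/ns (free space; a real trace is slower)

    round_trip_ns = 2 * stub_in / speed_in_per_ns
    print(round_trip_ns)            # ~0.33 ns behind the original edge
    print(round_trip_ns / 10.0)     # ~3% of a 10 ns SDRAM cycle - ignorable
    print(round_trip_ns / 1.25)     # ~27% of a 1.25 ns Rambus bit time - fatal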
3) You mentioned that the control and data signals travel at the same rate in Rambus, versus at different rates in DDR DRAM. I'm guessing that traveling at different rates would cause some kind of crosstalk between the signals (electromagnetic effects?), and that having everything at the same rate minimizes it. Do you know if SLDRAM addresses this issue?
4) Rambus works in 10 ns chunks of time and does not appear to provide any benefit in the setup time it takes to begin retrieving or saving data. It takes 10 ns to load the Row command and address data (24 total bits sent over 3 lines = 8 transfers of 1.25 ns each = 10 ns), and 10 ns to load the Column command and address data (40 total bits sent over 5 lines = 8 transfers of 1.25 ns each = 10 ns). To read data, you first send the Row packet with the bank/page address (which includes the Activate command), wait 10 ns, then send the Column packet (which includes the Read command), and then wait the typical ~30 ns. (In SDRAM this is apparently programmable as a number of clock cycles, though the total time has to come out around 30 ns - with a slower clock you might wait 2 cycles of 15 ns each, versus the 3 cycles of 10 ns that Rambus has to wait.) Then data starts coming back. If the row (page) you want is already open in its bank, you can skip the Row packet and its 10 ns wait, and start with the Column packet.
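Here's the packet timing worked out, following my description above (the ~30 ns figure is the typical CAS-type delay I mentioned, not a datasheet number):

    # Packet timing: bits spread over a narrow bus at 1.25 ns per transfer.
    def packet_ns(total_bits, lines, transfer_ns=1.25):
        return (total_bits / lines) * transfer_ns

    print(packet_ns(24, 3))   # Row packet:    8 transfers -> 10.0 ns
    print(packet_ns(40, 5))   # Column packet: 8 transfers -> 10.0 ns

    # Worst-case read, per the sequence above:
    # Row packet + 10 ns wait + Column packet + ~30 ns delay
    print(10 + 10 + 10 + 30)  # ~60 ns before the first data comes back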
5) With standard SDRAM, you can define a burst length (one access, for example, might retrieve 256 bytes of data). I understand this has something to do with the row or column addresses/commands not being able to keep up with the data - one Read command streams many consecutive words - or maybe because the addressing is multiplexed (help me here, please!). Rambus does not need a burst mode because while the 16 bytes of data being read are coming back during a 10 ns period, you can keep updating the column lines in parallel, keeping up with the next block of 16 bytes to be read. I'm really fuzzy on this as well.
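If I have this right, the two rates match exactly, which would be why no burst mode is needed (assuming 2 bytes per 1.25 ns transfer, as above):

    # A column packet and a 16-byte data packet both occupy 10 ns, so a new
    # column address can be supplied for every data packet with no dead time.
    column_packet_ns = (40 / 5) * 1.25    # 8 transfers -> 10.0 ns
    data_packet_ns = (16 / 2) * 1.25      # 16 bytes at 2 bytes/transfer -> 10.0 ns
    print(column_packet_ns == data_packet_ns)   # True - the pipeline stays full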
6) Rambus has more banks of memory. My buddy tried explaining the importance of this, though I'm not completely clear on it yet. Apparently SDRAM is typically limited to a few banks (4, I believe, if you have 2 DIMMs), whereas Rambus DRAM can have many more (I think 256 if all sockets are filled with the appropriate type of RDRAMs). Can you explain why this is beneficial? He said something about "ping-ponging," which I believe has something to do with accessing multiple banks - allowing one bank to (p)recharge while you access another - but that's a guess. I think I saw that SLDRAM also supports more banks.
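Here's a toy model of what I think the ping-ponging buys you (all timing numbers are illustrative guesses, not spec values):

    # Round-robin accesses across banks. Each access needs a "setup"
    # (precharge + activate) that can overlap other banks' transfers,
    # but the data pins themselves are shared.
    SETUP_NS = 40.0      # guessed precharge + activate time per access
    TRANSFER_NS = 10.0   # one 16-byte packet on the channel

    def total_ns(accesses, banks):
        bank_free = [0.0] * banks   # when each bank can start its next setup
        bus_free = 0.0              # when the shared data pins free up
        for i in range(accesses):
            b = i % banks
            row_ready = bank_free[b] + SETUP_NS   # row opened in bank b
            start = max(row_ready, bus_free)      # wait for the data pins
            bus_free = start + TRANSFER_NS
            bank_free[b] = bus_free
        return bus_free

    print(total_ns(16, 1))   # 800 ns: every access pays the setup serially
    print(total_ns(16, 8))   # 200 ns: setups overlap, ~1 packet per 10 ns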
7) The clock signal appears to first travel in through the RIMMs to the chipset, then back out through the RIMMs. When data is being read from memory, the RIMMs coordinate sending data with the clock going toward the chipset. When data is being written back to the memory, the RIMMs use the clock signal coming back out from the chipset. In this way, the data is always guaranteed to follow the clock signal by an appropriate distance. Does DDR DRAM do this or does it use a traditional "master" clock?
8) Rambus signal levels are defined relative to a reference voltage. As I understand it from my friend, this means that if, for some reason, the reference voltage (which appears to be 1.4 volts) drifts, the control and data signals float along with it (+/- 0.4 volts from the reference voltage) across the whole system, and signal integrity is maintained. This is as opposed to fixed voltage thresholds - say, a logical 0 at 0.8 volts and a logical 1 at 0 volts - where the ground might drift to 0.2 volts and make the logical values asymmetric.
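A little sketch of why tracking the reference helps (the 1.4 V / +/-0.4 V numbers are the figures above; which polarity maps to which logic level is my assumption):

    # The receiver compares the pin voltage against the shared reference,
    # so drift common to both cancels out.
    def decode(v_pin, v_ref=1.4):
        return 1 if v_pin > v_ref else 0

    drift = 0.2   # suppose the whole system drifts up by 0.2 V
    print(decode(1.8), decode(1.8 + drift, 1.4 + drift))   # 1 1 - unchanged

    # A fixed-threshold receiver gets no such help: the same 0.2 V of
    # ground drift eats half the margin of a 0 V "logical 1" directly.
    def fixed_decode(v_pin, threshold=0.4):
        return 0 if v_pin > threshold else 1

    print(fixed_decode(0.0), fixed_decode(0.0 + drift))    # margin: 0.4 V -> 0.2 V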
9) Finally, I have to correct a misstatement I've made before on this thread. I've said that the memory and processor have to be some integral multiple of a common clock, but apparently that's not true. First, the processor can be an integral multiple or an "integer-and-a-half" multiple of the system clock. Second, according to my buddy, the chipset is responsible for synchronizing memory accesses to the processor - the processor and memory can have completely independent and unrelated clock timings. Are you in agreement?
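For example (assuming a 100 MHz system clock; actual multipliers vary by processor):

    # Integer and "integer-and-a-half" CPU multipliers on a 100 MHz clock.
    for mult in (3.0, 3.5, 4.0, 4.5, 5.0):
        print(f"{mult}x -> {int(100 * mult)} MHz")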
TIA,
Dave