Re: Do you actually think that lower latency will be able to sustain performance as processors grow in clock frequency?
Do you think processors will be able to sustain performance as clock speeds increase without lower latency?
Modern chips, with on-die L1 and L2 caches, have hit rates in the high 90s percent. Data reads can be bytes, words, or longs, and instructions are often long words, so let's just assume 4 bytes are needed for each read. If the miss rate is as high as 5% (and it rarely is), then a 10 GHz chip issuing one read per clock would need bandwidth of 10,000 million / 20 = 500 million 4-byte reads per second, or 2 gigabytes per second. One PC266 channel can transfer 266 MHz x 8 bytes = 2,128 megabytes per second, or about 2 gigabytes per second, which is about what a 10 GHz processor will use.
So one PC266 channel just barely supports the bandwidth needed by a 10 GHz processor.
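For anyone who wants to check the bandwidth arithmetic or plug in their own numbers, here is a rough sketch in Python. The function names are mine, and it bakes in the same simplifying assumptions as the estimate above: one read per clock, 4 bytes per read, and a flat 5% miss rate.

```python
def required_bandwidth(clock_hz, miss_rate, bytes_per_read=4, reads_per_clock=1):
    """Bytes per second that must come from main memory.

    Assumes (as the estimate above does) that the chip issues
    `reads_per_clock` reads every clock and that only cache misses
    go out to memory.
    """
    return clock_hz * reads_per_clock * miss_rate * bytes_per_read


def channel_bandwidth(transfers_per_s, bus_width_bytes):
    """Peak bytes per second for one memory channel."""
    return transfers_per_s * bus_width_bytes


# 10 GHz chip with a 5% miss rate
need = required_bandwidth(10_000_000_000, 0.05)

# One PC266 channel: 266 million transfers per second, 8 bytes wide
have = channel_bandwidth(266_000_000, 8)

print(f"needed:    {need / 1e9:.2f} GB/s")
print(f"available: {have / 1e9:.2f} GB/s")
```

Drop the miss rate to 2% or 3% and the required figure falls well under what one channel delivers, which is why raw bandwidth is not the bottleneck in this estimate.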
The thing is, when the chip has a cache miss, it stalls until the required instruction or data is available. Even the best PC266 has latency on the order of 50 ns, which is 100 clocks on a 2 GHz chip. So latency is very important. At 10 GHz the chip would be stalled for 500 clocks every time it needed an outside read. Rambus has longer latency than DDR, which makes Rambus even worse at higher speeds. If AMD can cut latency by 15 to 20 ns by bypassing a chipset memory controller, it will result in an instant and substantial increase in performance, especially as clock speeds increase.
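The stall figures are easy to reproduce. A minimal sketch, assuming a fixed 50 ns memory latency and that the chip does nothing useful while it waits (the function name is just for illustration):

```python
def stall_clocks(latency_ns, clock_ghz):
    """Clocks lost per cache miss: nanoseconds times clocks per nanosecond."""
    return latency_ns * clock_ghz


for ghz in (2, 10):
    base = stall_clocks(50, ghz)
    # Effect of trimming 15 to 20 ns off the 50 ns latency
    saved_low = base - stall_clocks(50 - 15, ghz)
    saved_high = base - stall_clocks(50 - 20, ghz)
    print(f"{ghz} GHz: {base} clocks per miss, "
          f"{saved_low} to {saved_high} clocks saved")
```

The same 15 to 20 ns saving is worth 30 to 40 clocks per miss at 2 GHz but 150 to 200 clocks at 10 GHz, so the payoff from lower latency grows with clock speed.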
Is that bandwidth enough? No, of course not. Even today's processors benefit from higher memory bandwidth. When a miss occurs, there is a pretty good probability that a new branch has been taken and that a number of additional reads will be required, and those can be predicted and prefetched. So additional bandwidth is helpful and can improve performance. But latency is what's critical. That's why one channel of Rambus can't compete even with one channel of PC133 memory, despite its much greater bandwidth.