Formerly About Advanced Micro Devices


To: Brian Hutcheson who wrote (45535), 1/12/1999 1:04:00 AM
From: Petz
 
Brian, on doubling the bus width: technically, this doesn't improve latency (the time to get the first instruction or piece of data), it only improves throughput. That's why the K7 with a 200 MHz bus and DDR SDRAM looks good to me.
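
A rough way to see the latency-vs-throughput distinction, as a sketch with illustrative numbers (the latency and bus figures below are assumptions, not from the post):

# Hypothetical numbers: ~60 ns DRAM access latency and a 64-bit, 100 MHz
# bus (0.8 bytes/ns). Doubling the bus width doubles bytes/ns but leaves
# the fixed latency term untouched.
def time_to_fetch(bytes_needed, latency_ns, bytes_per_ns):
    # total time = fixed access latency + transfer time
    return latency_ns + bytes_needed / bytes_per_ns

narrow = time_to_fetch(32, 60, 0.8)   # 32-byte cache line, 64-bit bus
wide   = time_to_fetch(32, 60, 1.6)   # doubled bus width

print(narrow)  # 100.0 ns
print(wide)    #  80.0 ns -> transfer time halves, but the 60 ns latency
               #  (time to the first word) is unchanged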

Petz



To: Brian Hutcheson who wrote (45535), 1/12/1999 1:09:00 AM
From: Scumbria
 
Brian,

<but you still have to get the data from RAM into L2 cache and that is a bottleneck>

Increasing the bus width would have almost no impact on performance.

The jump from a 32-bit non-pipelined bus on the 486 to a 64-bit pipelined Pentium bus was significant because of the small caches and the fact that it reduced linefill time from 8/16 clocks to 4 clocks.

Further increases in bus width or clock speed will not show the same kind of performance improvement. The most you could hope for with a 128-bit bus would be a reduction of two clocks per linefill.
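
To make the linefill arithmetic concrete, here is a small sketch. It assumes a 32-byte cache line and a pipelined bus that moves one bus-width chunk per clock; the 486's 8/16-clock range is modeled by allowing extra clocks per transfer on its non-pipelined bus. These assumptions are mine, not stated in the post:

LINE_BYTES = 32  # assumed cache line size

def linefill_clocks(bus_bits, clocks_per_transfer=1):
    # number of bus transfers needed to fill one cache line
    transfers = LINE_BYTES // (bus_bits // 8)
    return transfers * clocks_per_transfer

print(linefill_clocks(32, 1))   # 8  clocks
print(linefill_clocks(32, 2))   # 16 clocks -> the 8/16 range for the 486
print(linefill_clocks(64, 1))   # 4  clocks -> Pentium's 64-bit pipelined bus
print(linefill_clocks(128, 1))  # 2  clocks -> a 128-bit bus saves only 2 clocks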

Scumbria



To: Brian Hutcheson who wrote (45535), 1/12/1999 11:03:00 AM
From: Ali Chen
 
Brian, <but you still have to get the data from RAM into L2 cache and that is a bottleneck>

No, that is not true. Let's say Winstone has 10 applications with 2 MB of code each, plus it manages about 80 MB of data during the benchmark. That is roughly 100 MB of data to load. With a real memory bandwidth of 100 MB/s (as measured by the STREAM benchmark; a P-II goes up to 200), it would take only ONE SECOND to load all that stuff into the caches. Yet the net benchmark run time is no less than 10 minutes, or 600 seconds. Therefore the data/code loads you mention are not the bottleneck. In reality a computer loads/stores a bit more than that, but the overall rule for caches still holds: "load once, execute many." That's why the external bandwidth requirements are minimized, as Scumbria-RYBA noted.
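
The back-of-the-envelope arithmetic in this post can be redone directly; this sketch just uses the post's own figures:

code_mb  = 10 * 2          # 10 applications, ~2 MB of code each
data_mb  = 80              # data touched during the benchmark
total_mb = code_mb + data_mb          # ~100 MB to pull in from RAM

stream_bw_mb_s = 100       # sustained bandwidth per STREAM (P-II: up to ~200)
load_time_s    = total_mb / stream_bw_mb_s   # ~1 second to load it all

run_time_s = 10 * 60       # benchmark runs no less than ~10 minutes
print(load_time_s, run_time_s, load_time_s / run_time_s)
# -> 1.0 s vs. 600 s: loading code/data is well under 1% of the run time,
#    so memory-to-cache fills are not the bottleneck.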