To: Ali Chen who wrote (39199) 10/13/1998 4:22:00 PM From: Tenchusatsu
Ali, re: <For a 1:1 L2 SRAM, Intel has already demonstrated that for a processor with 12-14-stage pipeline, some extra L2 latency/bandwidth does not matter in terms of real-world performance - compare again your lovely Xeons (1:1 L2) with regular P-II (2:1 L2). For the latter, a cacheline burst is probably 4-2-2-2, which is still smaller than the P-II 12-clock deep execution pipe, and many OOOO (Other Out-Of-Order) instructions can mask the processor bubble anyway>

Mendocino's lower-latency on-chip cache does prove that latency matters in terms of real-world performance, but Xeon also proves that beyond a certain point, faster L2 caches aren't going to help in mainstream applications. (Of course, mainstream apps don't matter for Xeon, which is targeted towards servers and high-end workstations, but I'm mentioning the mainstream just to keep the comparison level.)

As for "Other OOO" instructions masking the processor bubble caused by the off-chip L2 cache latency, this may be true for clock speeds up to 300 MHz (my guess). But beyond that, the longer L2 cache latencies of the Pentium II will make that processor bubble larger and larger, to the point where it can't be masked anymore by instruction-level parallelism. If latency really weren't an issue for P6, then Intel wouldn't feel compelled to switch to on-chip caches for all of its 0.18 micron offerings.

As for the K7's cache hierarchy, we'll see sooner or later whether AMD made the right design decisions or not. I'm still puzzled that AMD would go with a large 128K L1 cache, knowing that larger caches, especially at the L1 stage, usually mean longer latencies or slower clock speeds. What are your thoughts on this, Ali?

Tenchusatsu
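
To put rough numbers on the masking argument above, here is a minimal Python sketch. It is a toy model, not measured data. It assumes the off-chip L2 has a fixed access time in nanoseconds, so its latency in core clocks grows as the core clock rises, and it assumes out-of-order execution can hide a fixed number of clocks. The 20 ns access time and the 6-clock hidden window are hypothetical figures chosen only for illustration; the 4-2-2-2 burst is the pattern quoted in the post.

```python
def burst_clocks(pattern):
    """Total clocks for a cacheline burst such as 4-2-2-2."""
    return sum(pattern)


def latency_in_core_clocks(latency_ns, core_mhz):
    """Convert a fixed latency in ns to core clocks, rounded up.

    Integer ceiling division: clocks = ceil(latency_ns * core_mhz / 1000).
    """
    return -(-latency_ns * core_mhz // 1000)


def exposed_stall(l2_clocks, hidden_clocks):
    """Clocks of L2 latency left over after out-of-order work hides some."""
    return max(0, l2_clocks - hidden_clocks)


# Hypothetical figures, NOT taken from any datasheet.
ASSUMED_L2_ACCESS_NS = 20   # assumed fixed off-chip L2 access time
ASSUMED_HIDDEN_CLOCKS = 6   # assumed clocks that other OOO work can cover

if __name__ == "__main__":
    print("4-2-2-2 burst total:", burst_clocks((4, 2, 2, 2)), "clocks")
    for mhz in (300, 450, 600):
        l2 = latency_in_core_clocks(ASSUMED_L2_ACCESS_NS, mhz)
        stall = exposed_stall(l2, ASSUMED_HIDDEN_CLOCKS)
        print(f"{mhz} MHz: L2 latency {l2} clocks, exposed stall {stall} clocks")
```

Under these assumptions the stall is fully hidden at 300 MHz and grows with every clock bump after that, which is the shape of the argument in the post. Different assumed figures would move the crossover point.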