To: Joe NYC who wrote (29894) 10/14/1998 2:56:00 PM From: Scumbria
7 clock access latency for L2

Joe,

As clock speeds go up, the number of clocks needed to access the caches will also have to go up. A tag compare and data access from a big cache requires a lot of transistor switching and electrical communication across long wires. If the architects are planning to run at 1GHz (1ns cycle time), 7 clocks (7ns) seems quite reasonable. That is close to the access time of the Mendocino "full speed" L2.

K7:
Two General Purpose 64-bit Load/Store Ports into D-Cache
- 3-Cycle Load Latency
- Multi-banking Allows Concurrent Access by 2 Load/Stores

M3:
16K, 4-way, non-blocking data cache (3 cycle access, 1 load port, 1 store/fill port)

The K7 can do 2 reads, 2 writes, or 1 read/1 write from the data cache per clock. The M3 can do 1 read and/or 1 write/linefill per cycle.

It is interesting to note that both caches have a 3 cycle latency. I don't believe that either AMD or Cyrix has ever built an L1 with more than 1 cycle latency. I haven't had time to look in detail at the specs for either of these processors, but the question comes to mind: are the L1 accesses pipelined? A 3-cycle non-pipelined L1 access would be disastrous for performance, since each port could start only one access every 3 clocks, effectively reducing the clock speed by a factor of 3 for cache accesses.

It seems pretty safe to assume that the K7 will be a higher performance CPU than M3. The L1 cache on K7 is 4X the size of M3's, and K7 has many more execution units.

The big advantage that M3 may have is the graphics. By putting the graphics unit onboard the CPU die, it can load textures at the full RDRAM rate. K7 and Katmai implementations have to load textures via AGP at an effective 66, 133 or 266 MHz (AGP 1X, 2X or 4X).

Scumbria
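
For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic in the post above. The function names and the simple throughput model are assumptions for illustration only, not anything taken from the K7 or M3 specs: a pipelined cache is assumed to accept one new access per port per clock, and a non-pipelined one is assumed to tie up its port for the full latency.

```python
def latency_ns(cycles: int, clock_ghz: float) -> float:
    """Access latency in nanoseconds: cycles times cycle time (1 / GHz)."""
    return cycles / clock_ghz


def accesses_per_clock(latency_cycles: int, ports: int, pipelined: bool) -> float:
    """Peak cache accesses started per clock under the assumed model.

    Pipelined: every port can start a new access each clock.
    Non-pipelined: a port is busy for the whole latency, so it can
    start only one access every `latency_cycles` clocks.
    """
    if pipelined:
        return float(ports)
    return ports / latency_cycles


if __name__ == "__main__":
    # 7-clock L2 at 1 GHz (1 ns cycle time)
    print(latency_ns(7, 1.0))                         # 7.0 ns

    # 3-cycle L1 with one load port (the M3 figures quoted above)
    print(accesses_per_clock(3, 1, pipelined=True))   # 1.0 per clock
    print(accesses_per_clock(3, 1, pipelined=False))  # one third per clock
```

Under this model the pipelined and non-pipelined cases differ by exactly the latency in cycles, which is where the "factor of 3" comes from.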