wbmw,
I sense a general disinterest in the subject from you.
On the contrary, you piqued my interest. There was a whitepaper on the Tbird L2 that I will try to look up.
I wonder what makes the Athlon caches so uninteresting, but the Pentium 4 caches so very interesting? My guess is that justifying the mystery of lower IPC in the Pentium 4 tends to get people more interested in becoming armchair experts on the subject, while the Athlon is just assumed to be a better micro-architecture (by some).
I think you hit the nail on the head. One thing I would differ on: I doubt that the L2 operating at half speed would explain the lower IPC. I believe the latency of an L2 access is in the ballpark of 7 or 8 cycles, so a potential overhead of 1 cycle on top of that is small as a percentage, unlike the 100% overhead the decoder would incur if it runs at 1/2 speed.
The L2 is a collateral issue. If it can be determined that the decoder runs at 1/2 speed, on top of the trace cache, and if the large chip area of the L2 also operates at 1/2 of the clock speed, it may mean that, of all the various clock speeds on the chip, the one covering the largest area is 1/2 of the marketing speed, and one would have to question whether Intel is justified in marketing the chips that way.
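To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. The 7 and 8 cycle latencies are only my ballpark guesses from above, the 1 cycle penalty is an assumption rather than a measured figure, and the function name is made up for illustration.

```python
def relative_overhead(base_cycles, extra_cycles):
    """Extra cycles expressed as a fraction of the base cycle count."""
    return extra_cycles / base_cycles

# Assumed L2 latency of 7 or 8 cycles, plus a hypothetical 1 cycle penalty
# if the L2 were clocked at half speed.
for base in (7, 8):
    pct = relative_overhead(base, 1) * 100
    print(f"L2 latency {base} cycles + 1 cycle: {pct:.1f}% overhead")

# A decoder at half speed needs 2 time units for work that would take 1,
# i.e. 1 extra unit on a base of 1.
decoder_pct = relative_overhead(1, 1) * 100
print(f"Decoder at 1/2 speed: {decoder_pct:.0f}% overhead")
```

Under those assumptions the L2 penalty is somewhere between 12% and 15% of the access latency, while the decoder penalty is a full 100%.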
Joe