"I believe it adds up like this:"
Maybe I should tackle your analysis, point by point.
"4mm^2 for x86-64"
Ok. AMD has been saying 5% for a while, so this is reasonable.
"8mm^2 for 256KB extra cache"
This assumes that AMD uses a smaller cell size than it has been using recently. I don't know why AMD's cell size is so large, since they have been using local interconnect for a while, but there it is. At 180nm, 192k of L2 cache runs about 25mm^2. I just don't expect AMD to make it that much smaller.
Besides, I suspect that the L2 cache has been increased, not decreased, in size. Just look at slide 18 from the Hammer presentation in October. amd.com It has all of the L2 cache transactions occurring in 4ns on what is likely to be a 2GHz processor. That means the L2 has a latency of 8 cycles, considerably less than the 20 or so cycles of the current Athlons. cpusite.examedia.nl teamlambchop.com While the P4 is rated at a 7 cycle latency, actual measurements indicate there is more to it than that... cs.umd.edu
So, assuming that the presentation is correct, the Hammers' L2 cache is potentially 2-2.5 times faster (in cycles) than that of the current Athlon or the P4. Actual measurements may show a different story, so we have to wait and see. But, clearly, the L2 has been considerably reworked from the current Athlons.
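For anyone who wants to check the cycle arithmetic, here is a rough sketch. The 4ns figure, the 2GHz clock guess, and the roughly 20-cycle Athlon latency come from the discussion above; the function and variable names are made up purely for illustration.

```python
def latency_cycles(latency_ns, clock_ghz):
    """Convert a latency in nanoseconds to clock cycles (ns * cycles-per-ns)."""
    return latency_ns * clock_ghz

# Figures taken from the post: 4ns L2 transactions, an assumed 2GHz clock.
hammer_l2_cycles = latency_cycles(4.0, 2.0)

# "20 or so" cycles for the current Athlon L2, as stated in the post.
athlon_l2_cycles = 20

# How many times fewer cycles the Hammer L2 would need, if the slide is right.
speedup_in_cycles = athlon_l2_cycles / hammer_l2_cycles

print(hammer_l2_cycles, speedup_in_cycles)
```

With an Athlon latency of 20 cycles this lands at the top of the 2-2.5x range; a somewhat lower Athlon figure gives the bottom of it.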
"8mm^2 for on-die memory controller,"
I think this is on the high side, unless you are including the crossbar in this estimate.
"4mm^2 for micro-architectural changes such as deeper buffers, larger BHT, TLB, OOO queues, etc."
Sounds about right...
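As a quick sanity check on the numbers in this exchange, the sketch below adds up the four quoted estimates and compares the 8mm^2 cache figure with what 256KB would cost at the 180nm density mentioned above (about 25mm^2 for 192k). It assumes "192k" means 192KB and applies no process shrink, so treat the second number as an upper bound rather than a prediction. The dictionary keys and variable names are hypothetical.

```python
# Quoted die-area estimates from the post being answered, in mm^2.
quoted_estimates_mm2 = {
    "x86-64": 4,
    "256KB extra cache": 8,
    "on-die memory controller": 8,
    "micro-architectural changes": 4,
}

total_quoted_mm2 = sum(quoted_estimates_mm2.values())

# Density from the post: about 25 mm^2 for 192k of L2 at 180nm
# (assumption: "192k" means 192KB).
mm2_per_kb_at_180nm = 25 / 192

# Area of 256KB at that same density, with no process shrink applied.
cache_256kb_at_180nm_mm2 = 256 * mm2_per_kb_at_180nm

print(total_quoted_mm2, round(cache_256kb_at_180nm_mm2, 1))
```

The gap between 8mm^2 and the unscaled 180nm figure is what a smaller cell size plus a process shrink would have to make up.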