Tim, "Pardon my ignorance on the trace cache, and stubbornness:..."
No problem, no one is perfect...
However, you should know that the kids at sandpile.org can write whatever they want, but unfortunately it has to be taken with a grain of salt.
Better to think of it this way: according to Intel's own official scientific publication on its currently used 0.18um process,
intel.com
their 6-T SRAM cell cannot reliably run above 1GHz. Technically speaking, that means no bulk cache can run above that physical limit on any 0.18um Intel device (which we all actually witnessed with the 1.13GHz Coppermines, BTW).
You probably need to put more attention and thought into your own citation:
"Instruction Dispatch 6x µOPs/Cycle 3x µOPs/Cycle Limit imposed by Trace Cache"
What does it tell you if the main pipe can consume 6 uOps per clock (where the clock is currently 2GHz at most), but the Trace Cache "imposes" a limit of half that? Now, if Intel's designers decided to "impose" this limit, why would anyone build a decoder that runs twice as fast as the cache can consume? Where would the result be stored in that case?
Questions, questions....
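For anyone who wants the arithmetic spelled out, here is a rough sketch that takes the two figures from the citation at face value (6 uOps/cycle on one side, 3 uOps/cycle through the Trace Cache). The function names and the steady-state model are my own assumptions for illustration, not anything from Intel or sandpile.org, and the model ignores buffering and bursts.

```python
# Figures taken from the quoted line; everything else is an illustrative assumption.
FAST_SIDE_UOPS_PER_CYCLE = 6    # "6x uOPs/Cycle"
TRACE_CACHE_UOPS_PER_CYCLE = 3  # "3x uOPs/Cycle Limit imposed by Trace Cache"


def sustained_rate(*stage_rates):
    """Steady-state throughput of stages in series is set by the slowest one."""
    return min(stage_rates)


def utilization(stage_rate, sustained):
    """Fraction of a stage's peak rate that is actually used."""
    return sustained / stage_rate


def backlog_after(cycles, producer_rate, consumer_rate):
    """uOps piled up with nowhere to go if the producer outruns the consumer."""
    return max(0, producer_rate - consumer_rate) * cycles


rate = sustained_rate(FAST_SIDE_UOPS_PER_CYCLE, TRACE_CACHE_UOPS_PER_CYCLE)
print("sustained uOps/cycle:", rate)
print("fast side utilization:", utilization(FAST_SIDE_UOPS_PER_CYCLE, rate))
print("backlog after 1000 cycles:",
      backlog_after(1000, FAST_SIDE_UOPS_PER_CYCLE, TRACE_CACHE_UOPS_PER_CYCLE))
```

Under these assumptions the faster side sits half idle, and anything feeding the cache at the full rate would need storage for 3 extra uOps every cycle.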
- Ali