Scumbria, re: Willamette's ALU,
The way I see it, Intel is doing the equivalent of a four-issue integer engine. From the foil posted on AnandTech,
anandtech.com
it seems the limiting factor in IPC (instructions per cycle) would be the trace cache. I assume all of the units between the trace cache and the "Integer RF" (whatever RF means) can process more than four micro-ops at a time.
Of course, it remains to be seen whether four integer operations will ever be squeezed into one clock, considering the heinous nature of x86 code. I would think a double-speed ALU would make much more sense in IA-64, but then again, I'm biased toward IA-64 anyway.
<This is the big question about the Willy ALU. When does it actually remove bubbles from the pipeline?>
Are you saying that the double-speed ALU will only result in larger pipeline bubbles? The way I see it, the sooner integer instructions can be executed, the better. That may not make much of a difference in the vast majority of code, but if it can speed up critical sections, such as computational loops where pipeline bubbles are minimized, then all the better.
It's all a question of usefulness. You seem to be arguing that the double-speed ALU is 100% useless. I'm arguing that even if the thing is 10% useful, it will be worth it as long as implementing it doesn't impact overall clock speed.
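To put a rough number on that, here's a back-of-the-envelope sketch in Python using Amdahl's law. It rests on my own assumptions, not on anything from Intel's foils: I'm reading "10% useful" as "10% of execution time is bound by integer ALU work," and I'm assuming the double-speed ALU runs exactly that portion twice as fast. The function name and the input values are purely illustrative.

```python
def overall_speedup(alu_bound_fraction, alu_speedup=2.0):
    """Amdahl's law: overall speedup when only part of the run time gets faster.

    alu_bound_fraction -- hypothetical share of run time limited by the integer ALU
    alu_speedup        -- assumed speedup of that share (2.0 for a double-speed ALU)
    """
    if not 0.0 <= alu_bound_fraction <= 1.0:
        raise ValueError("alu_bound_fraction must be between 0 and 1")
    if alu_speedup <= 0.0:
        raise ValueError("alu_speedup must be positive")
    # New run time, relative to an original run time of 1.0:
    # the untouched part stays the same, the ALU-bound part shrinks.
    new_time = (1.0 - alu_bound_fraction) + alu_bound_fraction / alu_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # Illustrative fractions only; real workloads would have to be measured.
    for fraction in (0.0, 0.1, 0.5):
        print(f"{fraction:.0%} ALU-bound: {overall_speedup(fraction):.3f}x overall")
```

Under those assumptions, 10% ALU-bound time works out to roughly a 5% overall gain. That's small, but it comes for free if the clock speed isn't affected.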
Tenchusatsu