Elmer, here is another very interesting find!
Apparently, doubling the L2 cache was not the only change that Intel made to the Northwood core!
I am now looking forward to running Constantine's C++ benchmark.
-------------------------------
Not just a clock speed increase
One thing that was a pleasant surprise with the new Pentium 4 Northwood processors is that they do offer slightly more speed at the same clock speed than earlier Pentium 4 processors, showing once again that CLOCK SPEED IS NOT EVERYTHING! There is actually about a 5% to 10% jump in performance for most tests between the old Willamette core processors (i.e. the .18 micron Pentium 4 with the 256K L2 cache) and the Northwood (the .13 micron Pentium 4 with 512K L2 cache), even at the same 2.0 GHz clock speed.
Part of this boost is of course due to the larger 512K on-chip L2 cache. But as I found in recent testing, it is also due to the fact that certain machine language instructions are actually faster in the new processor. Specifically, instructions having to do with branching and jumping through indirect pointers are several clock cycles faster. These are the types of instructions that directly affect the speed of things like, well, emulators (which use a lot of table dispatches), C++ programs (similarly, due to a lot of indirect calls through vtables), and even DLL calls in standard Windows programs. Intel has finally started fixing the chip by reducing the clock cycle counts on common instructions.
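To make the two cases above concrete, here is a minimal C++ sketch of an emulator-style table dispatch and a vtable call. It is only an illustration: the Cpu struct, the four toy opcodes, and the function names are all made up for this example and are not taken from any real emulator or from the tests described above.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Hypothetical toy machine state, just enough to give the handlers work to do.
struct Cpu { std::int64_t acc = 0; };

using Handler = void (*)(Cpu&);

void op_inc(Cpu& c) { c.acc += 1; }
void op_dec(Cpu& c) { c.acc -= 1; }
void op_dbl(Cpu& c) { c.acc *= 2; }
void op_nop(Cpu&) {}

// Emulator-style dispatch table: the opcode selects a function pointer.
static const std::array<Handler, 4> kDispatch = {{op_inc, op_dec, op_dbl, op_nop}};

std::int64_t run_table(const std::vector<std::uint8_t>& program) {
    Cpu cpu;
    for (std::uint8_t opcode : program) {
        kDispatch[opcode & 3](cpu);  // indirect call through the table
    }
    return cpu.acc;
}

// C++-style dispatch: a virtual call reads the vtable pointer, then the
// function pointer, and then makes the same kind of indirect call.
struct Op {
    virtual ~Op() = default;
    virtual void apply(Cpu& c) const = 0;
};
struct Inc : Op { void apply(Cpu& c) const override { c.acc += 1; } };
struct Dbl : Op { void apply(Cpu& c) const override { c.acc *= 2; } };

std::int64_t run_virtual(const std::vector<const Op*>& ops) {
    Cpu cpu;
    for (const Op* op : ops) {
        op->apply(cpu);  // indirect call through the vtable
    }
    return cpu.acc;
}
```

In both loops the target of each call is only known at run time, so the cost of the indirect call or jump instruction is paid on every iteration. Timing either function over a long input (for example with std::chrono::steady_clock) on two processors at the same clock speed is one rough way to see a per-instruction difference of this kind.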
On the AMD side, as I said, the instruction timings have been virtually unchanged for over 2 years. This is good and bad. Good in terms of "if it ain't broke don't fix it", since the Athlon is already very efficient. But bad in that there are known instruction sequences that cause the Athlon to run slower than, say, an equivalent Pentium III: sequences having to do with resolving two memory addresses, for example, or code that is mostly running from L2 cache. The Athlon's L2 cache appears to have a 20 clock cycle latency, versus 3 cycles on the Pentium 4 and 4 cycles on the Pentium III. The Pentium 4 now not only has MORE on-chip cache, but it has FASTER on-chip cache. That's a huge blow to the Athlon.
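Cache latency figures like these are usually estimated with a pointer-chasing loop, where each load depends on the result of the previous one. The sketch below shows that general technique in C++; it is not the test that produced the numbers above, and the function names, the seed, and the buffer sizes are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Build one random cycle over n slots: next[i] is the slot visited after i.
std::vector<std::size_t> make_cycle(std::size_t n, unsigned seed) {
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    std::vector<std::size_t> next(n);
    for (std::size_t i = 0; i < n; ++i) {
        next[order[i]] = order[(i + 1) % n];
    }
    return next;
}

// Follow the chain. Every load needs the previous load's result, so the
// processor cannot overlap them and the time per step tracks load latency.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t steps) {
    std::size_t i = 0;
    for (std::size_t s = 0; s < steps; ++s) {
        i = next[i];
    }
    return i;
}

// Average nanoseconds per dependent load for a working set of `bytes`.
double ns_per_load(std::size_t bytes, std::size_t steps) {
    const std::vector<std::size_t> next =
        make_cycle(bytes / sizeof(std::size_t), 12345u);
    const auto t0 = std::chrono::steady_clock::now();
    volatile std::size_t sink = chase(next, steps);  // keep the loop alive
    const auto t1 = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(t1 - t0).count() /
           static_cast<double>(steps);
}
```

Picking a working set larger than L1 but smaller than L2 (say 128K on a chip with a 256K or 512K L2) makes most loads hit L2, and multiplying the nanoseconds per load by the clock speed in GHz gives an approximate cycle count. This is a rough sketch: several slots share a cache line here, so a careful test would space the slots at least one cache line apart.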
The current Pentium 4 is now almost right at the 80% level of efficiency that Intel specified for it. All in all, between the final 2.0 GHz .18 micron Pentium 4 and today's 2.4 GHz .13 micron Pentium 4, Intel has delivered about a 25% to 30% measured performance increase in the space of only 3 months.
wbmw