Itanium: new benchmarks & new roadmap (INTEL)
Posted By johan Tuesday, April 10, 2001 - 7:25:27 AM
Our friends at Tweakers report interesting news about Intel's plans with IA-64 and x86 (IA-32).
According to a new roadmap update, Madison and Deerfield will be available in 2005 instead of 2003. Madison is targeted at the market of the Xeon today, while Deerfield is a high end consumer chip like the Pentium 4 today. Intel will cease the design of new IA-32/x86 architectures in 2005. Therefore it is possible that the Pentium 5 (pure speculation : ) is the last Intel x86 architecture...
The Itanium should be Intel's chip in the High End 64 bit server market until 2002. Take that with a bit of salt of course, because the chip might be delayed once more. In 2002, McKinley, which is supposed to be twice as fast as Itanium, will take over.
On a related note, Alexei Pylkin has posted some numbers from an IA-64 compiled benchmark on the technical messageboard:
Recently I've tested Itanium system containing 666MHz Itanium processor, Linux64 OS and Cygnus IA64-gcc compiler (-O3 -funroll-all-loops). For the testing I used C-program implementing 512x512-matrix to matrix multiplication with elements of long, float and double types. So, there were three tests. The numbers for the Itanium are: long: 11 sec, float: 11 sec, double: 11 sec. For comparison:
Pentium-III/800MHz/PC133/SuSE 7.0 Linux/gcc (-O3 -funroll-all-loops) - long: 6, float: 6, double: 9
DS10-L (EV6/466MHz), Tru64 v5.1, Compaq C (-fast)                    - long: 2, float: 1, double: 2
DS40 (EV67/667MHz), Tru64 v5.1, Compaq C (-fast)                     - long: 1, float: 1, double: 1
The C code of the test can be downloaded here.
Even Compaq/Alpha's low-end offering, the Compaq DS10 (review here), beats the Itanium silly when the Itanium code comes from the Linux compiler.
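The test source itself is not reproduced in this post. As a rough idea of what such a test involves, here is a minimal sketch, assuming a naive triple-loop multiply over row-major arrays. The names `matmul_double` and `time_matmul_double` are hypothetical, and this is not Alexei's code. Only the double case is shown; the long and float tests would presumably look the same with the element type swapped.

```c
#include <stdlib.h>
#include <time.h>

/* Naive triple-loop product c = a * b for n x n row-major matrices.
   Assumed structure; the original test code may differ. */
void matmul_double(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Hypothetical helper: allocates and fills two n x n matrices, multiplies
   them once, and returns the CPU time in seconds (-1.0 if allocation fails). */
double time_matmul_double(int n)
{
    size_t count = (size_t)n * (size_t)n;
    double *a = malloc(count * sizeof *a);
    double *b = malloc(count * sizeof *b);
    double *c = malloc(count * sizeof *c);
    if (!a || !b || !c) {
        free(a);
        free(b);
        free(c);
        return -1.0;
    }

    for (size_t i = 0; i < count; i++) {
        a[i] = 1.0;
        b[i] = 2.0;
    }

    clock_t start = clock();
    matmul_double(n, a, b, c);
    clock_t stop = clock();

    /* Read one result through a volatile to discourage the compiler
       from discarding the multiply as dead code. */
    volatile double sink = c[count - 1];
    (void)sink;

    free(a);
    free(b);
    free(c);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_matmul_double(512)` times one 512x512 multiply. Note that `clock()` reports CPU time, which may not be how the original test measured its whole-second results.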
But then Alexei tested with a new compiler:
And now I used another compiler - SGI Pro64. It delivered totally different numbers! But still, Itanium performance is low as compared to Alpha 21264.
1) Results for 512x512-matrix to matrix multiplication with elements of long, float and double types: long: 2 sec, float: 1 sec, double: 1 sec
2) Results for 1024x1024-matrix to matrix multiplication with elements of long, float and double types:
Itanium 667                         - long: 49 sec, float: 20 sec, double: 31 sec
DS20E EV67/667 Tru64 v5.1, Compaq C - long: 9, float: 9, double: 11
ES40 EV67/667 Tru64 v5.1, Compaq C  - long: 10, float: 8, double: 10
DS10-L EV6/466 Tru64 v5.1, Compaq C - long: 23, float: 16, double: 23
P-III/933 Win2K/MSVC 6.0            - long: 38, float: 38, double: 60
Seems like SGI has some superb compiler writers... But Compaq/Alpha still rules! The 1024x1024 numbers are especially telling: with run times of tens of seconds instead of one or two, the error margin is much lower. Notice that even clock for clock, the Alpha still beats the Itanium.
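To put rough numbers on the clock-for-clock remark, run time can be multiplied by clock speed to estimate the cycles each chip spent. A minimal sketch, using the 1024x1024 figures quoted above; the helper name `gigacycles` is hypothetical:

```c
/* Billions of clock cycles consumed by a run: seconds * MHz / 1000.
   Hypothetical helper for comparing chips that run at different clocks. */
double gigacycles(double seconds, double mhz)
{
    return seconds * mhz / 1000.0;
}
```

For the double test, `gigacycles(31, 667)` comes to about 20.7 for the Itanium, against about 10.7 for `gigacycles(23, 466)` on the DS10-L. The 466 MHz Alpha needs roughly half the cycles. Against the 667 MHz EV67 systems no scaling is needed, since the clocks match: 31 sec versus 10 or 11 sec.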
Edit: It's important to keep in mind the effect the compiler has on the performance of this algorithm, especially in the case of Itanium, but also for all the tested architectures in general. The GCC compiler is very poor when it comes to optimization, and in this instance, performance is likely to be further restricted by the use of -funroll-all-loops. GCC has proven to exhibit poor performance on RISC architectures, such as Alpha and SPARC, in many cases. While these architectures are less susceptible to performance degradation at the hands of the compiler, we know Itanium is not so forgiving, which makes its worst case even worse. The other factor weighing heavily on these results is memory bandwidth, especially in the case of the 1024x1024 results. This is illustrated quite effectively by the Pentium III and the 466 MHz DS10-L (2 MB E-CACHE).
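A quick size estimate shows why memory matters more at 1024x1024. The sketch below assumes the test holds three full n x n matrices (two inputs and one result) in memory; the helper name `working_set_bytes` is hypothetical. Keep in mind that `long` is 8 bytes on the 64-bit Tru64 and IA-64 Linux systems but 4 bytes on 32-bit x86, so the long test does not move the same amount of data everywhere.

```c
#include <stddef.h>

/* Bytes occupied by three n x n matrices (two inputs, one result).
   A crude upper bound on the data touched; real cache behaviour also
   depends on the order in which the loops walk the arrays. */
size_t working_set_bytes(size_t n, size_t elem_size)
{
    return 3 * n * n * elem_size;
}
```

With 8-byte doubles, `working_set_bytes(512, 8)` is 6 MB and `working_set_bytes(1024, 8)` is 24 MB, twelve times the DS10-L's 2 MB cache.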
aceshardware.com
Milo