To: Petz who wrote (26092)
From: milo_morai
1/23/2001 4:41:36 PM

By: gollem
Reply To: None
Tuesday, 23 Jan 2001 at 3:03 PM EST

Intel Itanium sneak preview part 2
athena.tweakers.net

Intel's Tahoe architecture

Tahoe, the codename for Intel's new ISA (Instruction Set Architecture) based on EPIC, has received the official name Intel Architecture 64. In IA-64 the instruction bundles are 128 bits wide and contain a maximum of three instructions of 41 bits each, as opposed to IA-32, which does not have a fixed instruction length. These three instructions use 123 bits, which leaves 5 bits for extra information about the instruction bundle that helps the cpu use its resources more efficiently.

Apart from the large step from CISC to EPIC, Intel made the leap from 32 to 64 bit. What exactly does this mean? First, the cpu can work with values that are 64 bits (8 bytes) wide, which means larger numbers and greater accuracy than on 32 bit cpus. Second, the address bus can be expanded to 64 bits, which means the IA-64 architecture can address a maximum of 18 exabytes of memory. The 4 GB that Windows 98 can use, and even the 64 GB that Windows 2000 Datacenter Server in combination with Xeon cpus can address, pale in comparison.

Additionally, software pipelining has been added in IA-64. This feature, which is also called register rotation, is especially useful for actions that have to be applied to different variables, as in the while and for loops that every programmer knows. Instead of applying the same actions to a row of values one by one, the cpu simply moves the values in the registers one place. In reality the names of the registers are changed and the data is not physically moved, but the effect is the same: multiple variables in different stages of manipulation are present at the same time. This saves a lot of time because the code is smaller and less has to be fetched from memory.
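The bundle arithmetic, the size of a 64-bit address space and the register-renaming idea described above can be sketched in a few lines of Python. This is only an illustration, not how the hardware is built: the class RotatingRegisters, the function pipelined_double_plus_one, the three-stage loop and the use of None for an empty pipeline slot are all assumptions made for this example.

```python
# Illustrative sketch only; all names below are hypothetical.

# Bit budget of one IA-64 bundle, using the figures from the post.
BUNDLE_BITS = 128
SLOT_BITS = 41
SLOTS_PER_BUNDLE = 3
EXTRA_BITS = BUNDLE_BITS - SLOTS_PER_BUNDLE * SLOT_BITS  # 128 - 123

# Size of a full 64-bit address space, in bytes and in decimal exabytes.
ADDRESSABLE_BYTES = 2 ** 64
ADDRESSABLE_EXABYTES = ADDRESSABLE_BYTES / 10 ** 18


class RotatingRegisters:
    """A tiny register file whose register names rotate; data never moves."""

    def __init__(self, size):
        self.physical = [None] * size
        self.base = 0  # offset that maps a logical name to a physical slot

    def _slot(self, logical):
        return (self.base + logical) % len(self.physical)

    def read(self, logical):
        return self.physical[self._slot(logical)]

    def write(self, logical, value):
        self.physical[self._slot(logical)] = value

    def rotate(self):
        # After rotating, the value that was visible as r[i] is visible
        # as r[i + 1]. Only the base offset changes.
        self.base = (self.base - 1) % len(self.physical)


def pipelined_double_plus_one(values):
    """Compute 2 * x + 1 for each x with a three-stage software pipeline.

    Every pass through the loop works on three different items at once,
    each in a different stage. None marks an empty pipeline slot (a
    simplification chosen for this sketch).
    """
    regs = RotatingRegisters(3)
    results = []
    for step in range(len(values) + 2):  # two extra passes drain the pipeline
        # Stage 3: finish the item loaded two passes ago.
        doubled = regs.read(2)
        if doubled is not None:
            results.append(doubled + 1)
        # Stage 2: double the item loaded one pass ago.
        loaded = regs.read(1)
        if loaded is not None:
            regs.write(1, loaded * 2)
        # Stage 1: load a new item (or an empty slot once input runs out).
        regs.write(0, values[step] if step < len(values) else None)
        regs.rotate()
    return results
```

Calling pipelined_double_plus_one([1, 2, 3]) returns [3, 5, 7]. The loop body always uses the same three register names, and rotate() is what hands each item on to the next stage.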
The main principle behind IA-64 is functioning in parallel: an IA-64 cpu must be capable of doing a lot of things simultaneously. For this you need not only a lot of execution units but also a whole load of registers. The IA-32 cpus that we know now generally have 32 or fewer of these registers; IA-64 cpus will need over 256. Apart from making parallel work possible, the many registers have another advantage: complex calculations with many values can be executed quickly, without needing to fall back on the cache.

The performance of an IA-64 cpu is enormously dependent on its compiler. That is a disadvantage, because such a compiler is very complex to write. It has to take an enormous number of factors into account to analyse the source code and produce the instruction bundles. Someone writing such a compiler must know the cpu architecture inside out. The compilers for IA-64 have been under development for almost as long as the architecture itself, but in spite of all the effort they won't be optimal when released. Through optimizations and new ideas for smarter compilers, the performance of IA-64 cpus will be able to increase considerably over the years.

Now that you know what EPIC and IA-64 are, the time has come to talk about the very first IA-64 cpu, known at Intel under the codename Merced and to the public as Itanium. The physical properties of Itanium aren't impressive at all: six layers of aluminum with a lowly 25 million transistors, produced at 0.18 micron and running at a clock speed not exceeding 800 MHz. Even the Pentium III core seems more advanced with its 28 million transistors and a current top speed of 1 GHz, let alone the P4 with a 42 million transistor count at 1.5 GHz. Still, for a chip that should have been released in 1997, not bad.
The inside of the Itanium core is a lot more interesting. It is capable of processing two 128 bit IA-64 instruction bundles at the same time. This means a maximum of six instructions per clock tick, and because some instructions require multiple operations that Itanium can perform simultaneously, the number of operations per clock tick can rise to 20. For this a small army of execution units and registers is deployed. The 11 execution units and 328 registers can produce a theoretical maximum of 6.4 GFlops. This enormous amount of resources enables the Itanium to function in parallel, as required by the EPIC philosophy. Here is a breakdown of the internal features:

Execution units
  2 Floating point units
  4 Integer units
  2 Load-store units
  3 Branch units

Registers
  128 General-purpose (integer) registers
  128 Floating point registers (82 bit)
  64 Predicate registers
  8 Branch registers

The Itanium also has on-die cache, consisting of 16 KB L1 data, 16 KB L1 instruction and 96 KB L2 cache. This seems rather small for a cpu with such features, but Intel amply compensates for it with L3 cache: 2 or 4 MB can be placed on the processor cartridge, connected to the core with a 128 bit bus. Because the cache runs at full cpu speed, total bandwidth to the cpu becomes 11.9 GB/s for the 800 MHz version.

Naturally the Itanium had to be reliable too. ECC has been applied to virtually every internal cpu bus, which enables the identification and correction of errors without the need for a reboot. Errors can be logged too, and the cpu is capable of recognising errors in the rest of the system and correcting or isolating them. The Itanium does not rely on the motherboard to ensure a stable voltage either. On the side of the big black cartridge that houses the core, an Intel-designed voltage regulator, even bigger than the cpu itself, has to be connected.
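The headline figures quoted above for the Itanium core can be cross-checked with some simple arithmetic. The sketch below uses only numbers from the post. The variable names are made up for this example, and reading "11.9 GB/s" as binary gigabytes (2**30 bytes) is an assumption that happens to reproduce the quoted value.

```python
# Cross-check of the figures quoted in the post; names are illustrative.

CLOCK_HZ = 800_000_000  # 800 MHz version

# Two bundles per clock tick, three instruction slots per bundle.
BUNDLES_PER_CLOCK = 2
SLOTS_PER_BUNDLE = 3
instructions_per_clock = BUNDLES_PER_CLOCK * SLOTS_PER_BUNDLE

# Execution units and registers as listed in the breakdown.
execution_units = {"floating point": 2, "integer": 4, "load-store": 2, "branch": 3}
registers = {"general": 128, "floating point": 128, "predicate": 64, "branch": 8}
total_execution_units = sum(execution_units.values())
total_registers = sum(registers.values())

# 6.4 GFlops at 800 MHz implies this many floating point operations per tick.
PEAK_FLOPS = 6_400_000_000
flops_per_clock = PEAK_FLOPS // CLOCK_HZ

# L3 cache: 128 bit bus running at full cpu speed.
L3_BUS_BITS = 128
l3_bytes_per_second = (L3_BUS_BITS // 8) * CLOCK_HZ
l3_binary_gb_per_second = l3_bytes_per_second / 2 ** 30  # assumes 1 GB = 2**30 bytes

print(instructions_per_clock, total_execution_units, total_registers)
print(flops_per_clock, round(l3_binary_gb_per_second, 1))
```

The totals come out at 6 instructions per tick, 11 execution units and 328 registers, matching the text. The bus moves 12.8 billion bytes per second, which is about 11.9 when divided by 2**30.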
Then there's the problem of old software. Although Itanium is a completely new design, Intel's customers will want to use software that hasn't been ported to IA-64 yet every now and then. To make this possible the Itanium has a hardware decoder for IA-32 that's 100% compatible with current software, including MMX and SSE2. Of course the EPIC core can't do anything with these instructions directly, so everything is translated so that the Itanium can execute it and the old software can understand the output.

The chipset the Itanium uses is the Intel 460GX, capable of running a maximum of 4 Itanium cpus on a dual pumped bus at 133 MHz (effectively 266 MHz). The chipset can address a maximum of 64 GB of memory, with a choice between PC100 SDRAM and PC1600 DDR SDRAM. An integrated Ethernet card and AGP 4x are optional. For bigger stuff you'll have to go to different companies like NEC, IBM, HP and Compaq, which are working on 16- and 32-way Itanium chipsets. For such a 32-way system you'll have to break the piggy bank btw: an 800 MHz Itanium with 4 MB L3 cache will cost over 4000 dollars. It's clear the Itanium is absolutely not a desktop cpu.

(Voluntary Disclosure: Position- Long)
ragingbull.altavista.com

Nice M.