Part 3: Geeks are shortsighted ...
DDR for Servers: It all makes sense

We've heard countless times from Intel that their DDR focus will primarily be for servers, yet Intel does not have a single DDR chipset in their product line. With the Intel Xeon, a server part, launching next quarter, does anyone see a problem with this?
Intel doesn't, because Intel won't be making the majority of the server chipsets for the Intel Xeon (there is the i860, an MP version of the i850 for the Xeon); instead, ServerWorks will. We all remember ServerWorks from our recent review of their ServerSet III HEsl chipset, which uses dual interleaved 64-bit PC133 SDRAM channels to offer DDR-class memory bandwidth at PC133 SDRAM costs. The problem we ran into with the HEsl was that the Pentium III's 133MHz FSB just isn't fast enough to demand that sort of memory bandwidth; the Pentium 4's 100MHz quad-pumped FSB is.
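A quick back-of-the-envelope check of that claim, using the same arithmetic we'll apply to the Grand Champion HE later:

(133MHz operating clock) * (64-bit wide bus) * (2 interleaved channels) = 17.0Gbps = 2.1GB/s total memory bandwidth for the ServerSet III HEsl, right in DDR266/PC2100 territory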
Since the Intel Xeon will share the same FSB as the current Pentium 4, and since we have seen how memory bandwidth hungry a single Pentium 4 processor is, we can only imagine how hungry two or even four Intel Xeon processors will be. This is especially true if each processor is capable of SMT, with much more data being requested and consumed at the same time. Needless to say, even a 4-way interleaved SDRAM memory controller wasn't going to cut it for ServerWorks here.
There is definitely a reason Intel went to ServerWorks for the Intel Xeon chipset: ServerWorks' interleaving memory controller.
Enter the Grand Champion HE

The Pentium 4 and Xeon's FSB truly only runs at 100MHz; the bus is "quad-pumped," meaning that its peak effective bandwidth equals that of a 400MHz bus even though it actually operates at 100MHz. From talking to motherboard manufacturers, we've learned that it is very difficult to implement a DDR-based design where the memory bus frequency and the FSB frequency aren't synchronous. This means that the only DDR SDRAM that can be paired with the P4's bus is PC1600 DDR SDRAM, or as Intel likes to call it, DDR200, since it operates at 100MHz. DDR200 SDRAM "only" yields as much memory bandwidth as a single-channel PC800 RDRAM solution: 1.6GB/s, compared to the 2.1GB/s of DDR266 (PC2100 DDR SDRAM). This is where ServerWorks shows off their skills.
The Grand Champion HE chipset supports up to four Intel Xeon processors and, to feed them, features a 4-way interleaved DDR200 memory bus.
(100MHz operating clock) * (2 transfers per clock) * (64-bit wide bus) = 12.8Gbps = 1.6GB/s memory bandwidth for PC1600/DDR200 SDRAM
(1.6GB/s memory bandwidth) * (4-way interleave) = 6.4GB/s total memory bandwidth for Grand Champion HE
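For those who want to play with the numbers, here's a minimal Python sketch of the same arithmetic (the helper name is ours, purely illustrative):

def peak_bandwidth_gbs(clock_mhz, transfers_per_clock, bus_width_bits, channels=1):
    # clock * transfers per clock * bus width * channels, converted from bits/s to GB/s
    return clock_mhz * 1e6 * transfers_per_clock * bus_width_bits * channels / 8 / 1e9

print(peak_bandwidth_gbs(100, 2, 64))              # 1.6 -> a single DDR200 channel
print(peak_bandwidth_gbs(100, 2, 64, channels=4))  # 6.4 -> Grand Champion HE, 4-way interleave
print(peak_bandwidth_gbs(100, 4, 64))              # 3.2 -> the quad-pumped P4/Xeon FSB it feeds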
In order to provide for this incredible memory subsystem, the Grand Champion HE features a Northbridge with close to 1000 pins as well as 5 chips that together make up the memory controller subsystem.
The Grand Champion HE supports up to three PCI-X buses (the number of buses scales directly with the number of PCI-X controllers present on the board). It also features a Southbridge that is ATA/100 compliant, although it is currently only working at ATA/66.
Quad Intel Xeon platform running on a Grand Champion HE Reference Board
ServerWorks is also not interested in any of the incremental DDR speed increases over the next few months. Remember, DDR200 is the only thing they can use with the 100MHz quad-pumped bus; whatever DDR technology they implement must run at a base frequency of 100MHz. The other option they mentioned was support for DDR400, which would be DDR SDRAM running at 200MHz but effectively transferring at the same rate as the quad-pumped 100MHz FSB. They are talking to memory manufacturers about it; however, it is still at least a year away from being a viable option.
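The arithmetic behind that claim is straightforward:

(200MHz operating clock) * (2 transfers per clock) * (64-bit wide bus) = 25.6Gbps = 3.2GB/s for DDR400

(100MHz operating clock) * (4 transfers per clock) * (64-bit wide bus) = 25.6Gbps = 3.2GB/s for the quad-pumped P4/Xeon FSB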
Publicly, ServerWorks is stating that the Grand Champion HE will begin shipping in Q3 of this year; however, they told us that when Intel has a chip, they will have a chipset. Whether that means Q2 is anyone's guess.
ServerWorks on AMD

The obvious question for us to ask ServerWorks was what they thought of AMD entering the server market and whether they would ever work with AMD. Their response was actually pretty interesting, and it can basically be summed up in a few key points:
1) AMD needs a higher bandwidth FSB: Even at 266MHz, the Athlon's EV6 bus alone doesn't offer as much bandwidth as the P4/Xeon's FSB. A counterpoint to be made here, however, is that the EV6 is a point-to-point protocol, meaning that each CPU gets a dedicated path to the Northbridge, which could result in higher overall bandwidth utilization with more than one processor.
2) PCI-X, where are you? As we mentioned in our most recent server upgrade article, there is a clear need for higher bandwidth PCI, and even 64-bit PCI won't cut it once you add in things like Gigabit Ethernet and high-end RAID arrays (the sketch after this list puts rough numbers to points 1 and 2).
3) AMD needs to invest in the technology: We got the hint from ServerWorks that they don't see AMD as a heavy investor in the type of technology these guys are building. The higher margins are in server products, but AMD has to start somewhere, so the AMD 760MP may just be the first step into a much larger world.
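Here's the minimal Python sketch promised above; all figures are peak theoretical bandwidth, and the helper name is our own:

def gbs(clock_mhz, transfers_per_clock, bus_width_bits):
    # peak bandwidth in GB/s: clock * transfers per clock * bus width, bits/s to GB/s
    return clock_mhz * 1e6 * transfers_per_clock * bus_width_bits / 8 / 1e9

ev6_per_cpu = gbs(133, 2, 64)  # Athlon's 266MHz EV6 bus: ~2.1GB/s, but point-to-point (per CPU)
p4_fsb = gbs(100, 4, 64)       # quad-pumped P4/Xeon FSB: 3.2GB/s, shared by all CPUs
pci_64bit = gbs(66, 1, 64)     # 64-bit/66MHz PCI: ~0.53GB/s shared by every card on the bus
pci_x = gbs(133, 1, 64)        # 64-bit/133MHz PCI-X: ~1.06GB/s per bus
gigabit_ethernet = 0.125       # GB/s per direction, before protocol overhead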
We showed you earlier a breakdown of the "Intel-inside" systems in the server market, and in there we mentioned that the highest profit margins reside in the hard-hitting SMP systems with more than 4 or 8 processors. This is the market that AMD must eventually set its sights on as well. AMD is in a strong position since its flash memory division has been rather successful; however, AMD today is where Intel was at the release of the Pentium II. AMD has the ability to take its current product line, which is outperforming much of the competition, and segment it further to tend to the needs of the higher-end server and workstation markets. This is what AMD is trying to do with the release of the 760MP, but we won't see the true potential of AMD's branch into this market until the introduction of the Hammer line of processors.
ServerWorks isn't going to burn any bridges with Intel just to make a dual processor chipset for AMD. If AMD were to deliver a 4- or 8-way ClawHammer/SledgeHammer design that could outperform the Intel Xeon/Itanium, that might just be enough to pique the interest of the folks at ServerWorks. Until that day, the Intel Xeon will be the only CPU enjoying a 6.4GB/s memory bus.
Set your sights on Mt. McKinley

The past two IDF conferences have centered on the Pentium 4 and Intel's upcoming IA-64 processor, the Itanium. And although the Pentium 4 was present all over the show floor, the Itanium's presence was not nearly as great. Instead, the Itanium's successor, codenamed McKinley, was the talk of the town this time.
As for an Itanium update: the Itanium is still on schedule to be released in the second quarter of this year. It will debut at 733MHz, possibly with higher speed grades to follow, and with a backside bus to its off-die L3 cache.
McKinley, most likely branded under some form of the Itanium name, will be introduced at the end of this year, with platforms shipping in 2002. McKinley will offer some significant improvements over the Itanium, including an updated FSB with three times the bandwidth of the Itanium's bus. It will feature even more execution units for even more parallel processing power, and it will move the Itanium's L3 cache on-die.
The fact that Intel has room to move the L3 cache on-die leads us to believe that McKinley will be a 0.13-micron chip, which gives it approximately 50% die savings over the 0.18-micron Itanium. This could definitely help increase the clock speeds and yields of McKinley to the point where it would be a formidable contender in the 8-way server market. Only time will tell.
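That 50% figure falls out of simple geometry, since a linear shrink scales die area with the square of the feature size:

(0.13-micron / 0.18-micron)^2 = 0.52, i.e. the same design occupies roughly half the area after the shrink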
After just recently receiving the initial stepping of the McKinley processor from its fabs, Intel had multiple McKinley systems up and running copies of 64-bit Windows XP, 64-bit Linux and HP-UX. No one has ever argued with Intel's manufacturing capabilities; it is only when Intel pushes an architecture beyond its limits that problems occur (e.g. the Pentium III 1.13GHz).
McKinley on 64-bit Windows XP
McKinley on 64-bit Linux
Already making improvements

If you'll remember our IDF coverage from around a year ago, we brought you the first pictures of the Itanium. The systems were pretty unique since each Itanium CPU had a power brick sitting next to it that was just as large as the CPU itself. The pictures below will hopefully refresh your memory.
The Itanium and its equally large power brick
The problem platform designers ran into was that this was simply too big a design. In a data center, rack space can get pretty expensive (around $1000 USD per rack per month), and with some companies occupying close to 20 racks, they obviously want to make the best use of the space they have. However, sticking a dual processor Itanium setup into a 1U case (about 1.75" high) would be impossible. Here's where McKinley comes in.
If McKinley is indeed a 0.13-micron processor (which would make the most sense), it will inherently run cooler than the Itanium. The move to 0.13-micron doesn't eliminate the heat dissipation problem entirely, though: two McKinley processors will still dissipate around 600W.
The design that resulted was quite ingenious. First, the McKinley processor is now integrated into a cartridge along with its power supply, as you can see below.
The McKinley cartridge with integrated power supply
You'll also notice that McKinley uses the same microPGA pin layout we discussed earlier; look at how tightly packed those pins are.
The cartridge already saves some space, but that still leaves the issue of getting rid of the heat. The solution was to place a very large blower at the front of the 1U rack and push a lot of air across the two heatsinks, making a dual processor McKinley system possible in a 1U case. The impeller, as Intel calls it (essentially a blower), is capable of moving 70 cubic feet of air per minute at a rotational speed of 2800 RPM.
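As a sanity check of our own (only the 70 CFM and ~600W figures come from Intel), the airflow and the heat load imply an exhaust-air temperature rise of roughly 15°C, which is workable for a 1U server. A minimal Python sketch, using standard textbook values for air:

# How much does ~600W warm 70 CFM of air?
CFM_TO_M3S = 0.000471947    # 1 cubic foot per minute in m^3/s
AIR_DENSITY = 1.2           # kg/m^3, roughly sea level at room temperature
AIR_HEAT_CAPACITY = 1005.0  # J/(kg*K)

airflow = 70 * CFM_TO_M3S   # the impeller's stated 70 CFM
heat = 600.0                # claimed dissipation of two McKinley processors, in watts
delta_t = heat / (airflow * AIR_DENSITY * AIR_HEAT_CAPACITY)
print(f"{delta_t:.0f} C exhaust-air temperature rise")  # ~15 C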
The impeller and dual McKinley heatsinks in the 1U chassis
And don't worry, there's more than enough room for memory.
Run for your lives: MTH returns

When the i820 chipset was released, OEMs had a very hard time pushing the chipset because of the incredible cost of RDRAM (very close to $1000 for a 128MB stick). Intel's quick and dirty fix was to introduce a chip called the Memory Translator Hub (MTH) that sat between the i820's Memory Controller Hub (MCH) and the memory banks, translating the 16-bit, 400MHz RDRAM signal into a 64-bit, 133MHz PC133 SDRAM signal. The same translation occurred on the way back from the memory to the MCH.
As you can guess, this translation resulted in a huge performance hit: anywhere from 10 to 20% compared to regular PC800 RDRAM without the MTH.
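Part of that penalty is fundamental, since the two signals don't even carry the same bandwidth, and the translation itself adds latency in both directions:

(400MHz operating clock) * (2 transfers per clock) * (16-bit wide bus) = 12.8Gbps = 1.6GB/s for PC800 RDRAM

(133MHz operating clock) * (1 transfer per clock) * (64-bit wide bus) = 8.5Gbps = 1.06GB/s for PC133 SDRAM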
On the i840, because of its dual channel RDRAM design, Intel developed another chip called the SDRAM Memory Repeater Hub (MRH-S), which performed the same function as the MTH did on the i820 except that it worked with the dual channel memory bus of the i840's MCH. Both the MTH and the MRH-S were recalled because of reliability issues, much like those we experienced with Tyan's short-lived MRH-S based Thunder 2400 board.
Something caught our eye when we were looking at a McKinley presentation at IDF. Since McKinley will be a server class product, it falls under Intel's blanket statement that DDR SDRAM is only for servers. McKinley will use the i870 chipset, and below is the block diagram that raised our eyebrows:
Block diagram of the i870 chipset
We've taken the liberty of blowing up the section we're interested in. See the MRH-D chips that lie between the i870 chipset and the DDR SDRAM banks? That is a DDR SDRAM Memory Repeater Hub, which sounds a lot like the MRH-S made for the i840, which in turn was a variant of the MTH we all loved to hate.
This could mean a huge performance penalty for McKinley, just like the one we noticed when we paired regular SDRAM with the MTH-enabled i820 or the MRH-S enabled i840.
The even bigger question is why Intel would release a McKinley chipset without a native DDR SDRAM controller. Maybe Intel is planning on bringing RDRAM to the server market as well? It would make sense, since it would be fairly easy to simply remove the MRH-D and replace the DDR SDRAM slots with RDRAM banks.
Keep your eyes on this one; it will be very interesting to see what Intel does here, very interesting indeed.
We did make sure to ask whether the future DDR platforms for the desktop market would in fact be true DDR solutions that don't use an MRH-D controller, and luckily we got confirmation of this. As you can see from the block diagram below of an upcoming desktop chipset codenamed "Plumas," it has an integrated DDR SDRAM controller.
From the block diagram, it also looks like this particular chipset has a dual channel DDR controller; however, that could also just be an artistic element of the drawing.
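If the drawing is literal, a dual channel design would fit nicely with the synchronous-bus argument from earlier (this is purely our speculation, the diagram specifies nothing):

(100MHz operating clock) * (2 transfers per clock) * (64-bit wide bus) * (2 channels) = 3.2GB/s, exactly matching the quad-pumped P4 FSB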
We've just scratched the surface…

There is so much more to report on; we've really just scratched the surface with this first article. InfiniBand, PCI-X, Intel's extreme hatred for Transmeta, the future of the desktop LCD and notebook markets, AMD's future, ATI vs. NVIDIA in the 3D graphics market... the list goes on.