To: survivin who wrote (48410) 7/19/2001 10:54:18 PM
From: Dan3

Re: Any ideas what a G4 would get by your measure?

The exact degree to which the MPC7400 and the K7 are superscalar can be somewhat confusing, due to the previously mentioned tendency of Motorola and AMD to count functional units differently. Motorola claims that the 7400 is 8-way superscalar, and AMD claims the K7 is 9-way superscalar. These numbers can be misleading.

Part of the confusion arises because Motorola counts the G4's LSU as one functional unit. AMD, on the other hand, does not count the LSU as a functional unit, but instead counts the three AGUs as functional units. Motorola also counts the Branch Processing Unit as an FU, while the K7 does not. Finally, the 7400 counts among its FUs something called the System Register Unit, a unit whose functionality is handled by a different part of K7 that AMD doesn't count as an FU. So in the end, if we try to count the 7400's FUs the same way that AMD counts the K7's FUs then we could call the 7400 a 6-way superscalar machine.

Let's compare the two back ends side-by-side:

MPC7400                           K7
1 floating-point unit             3 floating-point/vector units
2 vector units
2 integer units                   3 integer units
1 address generation unit         3 address generation units
  contained inside the
  LOAD/STORE Unit

As you can see, while there isn't a one-to-one correspondence between the FUs of each CPU, each unit does have a functional counterpart in the other processor. In particular, the K7's floating-point unit wears two different hats, whereas each of the MPC7400's units only performs one specific task.

arstechnica.com

Vector calculations

A comparison of the vector processing capabilities of the K7 and the MPC7400 is worthy of an article in and of itself, and in fact we have such an article in the pipe here. This being the case, I won't say too much here about the subject.
I will say, however, that the K7's SIMD skillz look pretty weak when compared to the MPC7400's. The 7400 has a dedicated vector processing unit called Altivec (a.k.a. the Velocity Engine), which can grind out 128 bits worth of vector-calculating goodness per cycle. This vector hardware is fed by thirty-two 128-bit registers. By way of contrast, the K7 uses its floating-point hardware to perform all vector calculations (3DNow! and MMX). 3DNow! specifies 7, 64-bit registers for storing SIMD data. Of course, those measly 7 vector registers, which are mapped onto the x87 floating-point registers, don't quite tell the whole story. The K7 gets around register-related lockups by using register renaming with its 88 physical floating-point registers. Finally, the 3DNow! and MMX vectors are only 64 bits wide, as opposed to Altivec's whopping 128-bit vectors. But like I said, we'll talk more about Altivec vs. MMX/3DNow!/SSE in a later article.

arstechnica.com

I'm not sure how to characterize the Altivec unit - 1 instruction acting on multiple data values in the 128-bit vector unit could reasonably be considered an additional 2 ops if 64-bit data values are considered, or as much as 8 ops if 16-bit data values are used. Added to the 6-way count above, that brings the G4 up to 8 to 14 ops per cycle. So Apple could rate its new 867 MHz processor at 7 to 12 GHz ops (8 x 867 MHz is about 6.9, and 14 x 867 MHz is about 12.1).

Regards,
Dan
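
For anyone who wants to check the arithmetic, here is a rough sketch of how those figures fall out. It assumes the 6-way baseline from the quoted article and the 2 and 8 extra vector ops guessed at above; the function name and constants are made up for illustration and are not anything Motorola or Apple publishes.

```python
# Rough sketch of the ops-per-cycle arithmetic in the post above.
# Assumptions (illustrative only, not vendor figures):
#   - baseline of 6 ops/cycle: the 7400 counted the way AMD counts the K7
#   - Altivec adds 2 ops (64-bit elements) or 8 ops (16-bit elements)

BASELINE_OPS = 6    # 7400 functional units, counted AMD-style
CLOCK_MHZ = 867     # clock speed quoted in the post


def rated_gops(extra_vector_ops, clock_mhz=CLOCK_MHZ, baseline=BASELINE_OPS):
    """Return (ops per cycle, billions of ops per second)."""
    ops_per_cycle = baseline + extra_vector_ops
    gops = ops_per_cycle * clock_mhz / 1000.0
    return ops_per_cycle, gops


if __name__ == "__main__":
    for label, extra in (("64-bit elements", 2), ("16-bit elements", 8)):
        ops, gops = rated_gops(extra)
        print(f"{label}: {ops} ops/cycle, about {gops:.1f} G ops/sec")
```

The same function works for any other clock speed by passing clock_mhz, so the guess at the extra vector ops is the only real judgment call in the number.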