Politics : Formerly About Advanced Micro Devices


To: greg nus who wrote (31861)  4/14/1998 4:51:00 PM
From: Kevin K. Spurway
 
All, here's a post I found over on the Betterchips website (www.betterchips.com) describing some of the advantages of the 3D element of K6-3D:

Posted by JC on March 25, 1998 at 16:53:04:

In Reply to: Re: 30% faster...are you sure? posted by Upsilon on March 25, 1998 at 14:53:19:

: : Another reason why I'm still skeptical is that
: : every other website but this one has said that
: : the K6-3D doesn't have a pipelined fpu -- that's
: : in the K6+3D. The 3D instruction set does some
: : real nice things for the fpu, but only if software
: : takes advantage of it.

: As I understand it, assuming the 3D instructions are used
: the data takes a different path and a pipelined FPU becomes
: irrelevant. Now what will use these instructions? A lot.
: DirectX 6 will have it, so that means that any D3D
: accelerated game will use them. Rumor has it that 3Dfx is
: adding support for the instructions in the next version of
: glide, so all native 3Dfx games will use the instructions.
: In addition, there are a lot of game companies designing games
: specifically customized for AMD's instructions. The
: biggest is Quake 3, of course, although that's hardly the
: only one.

Okay, this is what (I think) we'll be able to do with the
new instructions:

1. perform two operations at one time on half sized floating
point numbers. Without pipelining, it still won't be too cool,
I think.

Take the MULT instruction. Pretend it has a latency of 8 and an
initiation interval of 4. This means that, if you're pipelining,
you can start one MULT every 4 cycles.

Now, with AMD3D (which, in this example, is basically MMX for the
fpu), you can split the fp register into two fp's, each half the normal
size, and you can calculate the same function on both, allowing for
two MULTs every eight ('cuz of latency) cycles.

But with the AMD3D method, in order to multiply two numbers at
the same speed as Intel's fpu, you have to operate on numbers half
the size, and both numbers have to be given the exact same instruction
(eg: "MULT both numbers by 4.5").
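
To put rough numbers on that comparison, here is a small sketch of the arithmetic. It only uses the pretend latency of 8 and initiation interval of 4 from the example above, which are not real K6 or PII figures, and the helper name mults_per_cycle is made up for illustration.

```python
from fractions import Fraction

# Made-up figures from the example above, not measured K6 or PII numbers.
LATENCY = 8              # cycles from issuing a MULT to getting its result
INITIATION_INTERVAL = 4  # cycles before a pipelined unit can accept the next MULT


def mults_per_cycle(results_per_instruction, cycles_between_issues):
    """Sustained multiplies per cycle over a long run of independent MULTs."""
    return Fraction(results_per_instruction, cycles_between_issues)


# Pipelined scalar FPU: one result per instruction, a new MULT every 4 cycles.
pipelined_scalar = mults_per_cycle(1, INITIATION_INTERVAL)

# Unpipelined packed unit: two results per instruction, but each instruction
# has to wait out the full latency of the previous one.
unpipelined_packed = mults_per_cycle(2, LATENCY)

# Packed and pipelined (hypothetical here): two results every 4 cycles.
pipelined_packed = mults_per_cycle(2, INITIATION_INTERVAL)

for name, rate in [
    ("pipelined scalar", pipelined_scalar),
    ("unpipelined packed", unpipelined_packed),
    ("pipelined packed", pipelined_packed),
]:
    print(f"{name}: {rate} MULTs per cycle")
```

With these pretend numbers the unpipelined packed unit only ties the pipelined scalar one, and it gets there by working on half-size values. Packing plus pipelining would double the rate, which is the reason a pipelined fpu matters in this example.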

Luckily the K6 fpu has a shorter average latency than the PII fpu
(2 cycles versus 3). This will help. But the pipelined fpu in the
K6+3D will help more.

2. calculate a wicked-fast divide by first finding the
reciprocal of the denominator and then multiplying it by the
numerator. This will probably chop divides at least in half,
and I think divides are unpipelined in Intel chips anyway.
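
The reciprocal-then-multiply idea can be sketched in a few lines. This is only an illustration of the math, not AMD's actual instructions: the function names are hypothetical, the crude starting estimate stands in for whatever fast lookup the hardware might provide, and the Newton-Raphson refinement step is my assumption about how such an estimate would be sharpened.

```python
import math


def rough_reciprocal(b):
    """Crude estimate of 1/b (hypothetical stand-in for a fast hardware lookup)."""
    mantissa, exponent = math.frexp(b)  # b = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    # Straight-line fit to 1/m on [0.5, 1); relative error is at most 1/17.
    estimate = 48.0 / 17.0 - (32.0 / 17.0) * abs(mantissa)
    return math.copysign(math.ldexp(estimate, -exponent), b)


def refine_reciprocal(b, x):
    """One Newton-Raphson step toward 1/b, using only multiplies and a subtract."""
    return x * (2.0 - b * x)


def fast_divide(a, b, steps=3):
    """Compute a / b as a * (1 / b) without using a divide on a or b."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    x = rough_reciprocal(b)
    for _ in range(steps):
        x = refine_reciprocal(b, x)  # each step roughly squares the relative error
    return a * x


if __name__ == "__main__":
    for a, b in [(1.0, 3.0), (-7.5, 2.5), (355.0, 113.0)]:
        print(a, "/", b, "~", fast_divide(a, b), "exact:", a / b)
```

Each refinement step roughly doubles the number of correct digits, so fewer steps are enough when only half-size precision is needed.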

3. use an iterative, very fast algorithm to find a number's
square root to any precision we want. This is the best
feature of the instruction set. If applied properly, you
could chop by ten the time required to do a 3D transform
(hey, wouldn't you like that...400fps in Quake!). But it
probably won't be quite all that.
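
For the iterative square root, one common approach is Newton's method on the reciprocal square root, where each pass needs only multiplies and a subtract. The sketch below assumes that approach. It is not taken from AMD's documentation, the name fast_sqrt is hypothetical, and the starting guess of 1.0 is a deliberately simple placeholder.

```python
import math


def fast_sqrt(x, steps=6):
    """Approximate sqrt(x) by iterating on 1/sqrt(x); more steps, more precision."""
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0:
        return 0.0
    # Scale x to m * 2**e with an even exponent e, so that 0.5 <= m < 2.
    m, e = math.frexp(x)  # x = m * 2**e with 0.5 <= m < 1
    if e % 2:
        m *= 2.0
        e -= 1
    y = 1.0  # simple starting guess for 1/sqrt(m); converges for 0.5 <= m < 2
    for _ in range(steps):
        y = y * (1.5 - 0.5 * m * y * y)  # Newton step toward 1/sqrt(m)
    # 1/sqrt(x) = (1/sqrt(m)) * 2**(-e/2), and sqrt(x) = x * (1/sqrt(x)).
    return x * math.ldexp(y, -(e // 2))


if __name__ == "__main__":
    for steps in (1, 2, 4, 6):
        approx = fast_sqrt(2.0, steps)
        print(steps, "steps:", approx, "error:", abs(approx - math.sqrt(2.0)))
```

The steps argument is where the "any precision we want" part shows up: a couple of passes give a rough answer quickly, and extra passes tighten it.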

: Bring the debate on!
Just one more message after this...

-JC