Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: fyodor_ who wrote (19123) 11/14/2000 9:41:30 PM
From: Scott D.
 
re: SIMD II double precision floats

I have some experience with the old FPU math, but so far have only read about SIMD and SIMD II FPU math (I have used MMX and SIMD heavily for integer math).

From reading the public documents on the Intel site, it appears that the Pentium 4 gains the ability to operate on two 64-bit (double precision) floats in parallel. For example, see the ADDPD instruction description:

ADDPD - Packed Double-Precision Floating-Point Add

Description
Performs a SIMD add of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the IA-32 Intel Architecture Software Developer's Manual, Volume 1 for an illustration of a SIMD double-precision floating-point operation.

Operation
DEST[63-0] <- DEST[63-0] + SRC[63-0];
DEST[127-64] <- DEST[127-64] + SRC[127-64];

Opcode         Instruction              Description
66 0F 58 /r    ADDPD xmm1, xmm2/m128    Add packed double-precision floating-point values from xmm2/m128 to xmm1.
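
As a rough illustration of the ADDPD operation quoted above, here is a minimal C sketch. The helper name add_packed_double is made up for this example. It assumes a compiler that ships the SSE2 intrinsics header <emmintrin.h>, where _mm_add_pd typically compiles to ADDPD, and it falls back to two scalar adds when SSE2 is not available.

```c
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics: __m128d, _mm_add_pd, ... */
#endif

/* Hypothetical helper (not from the Intel manual): adds the two doubles
 * in src to the two doubles in dest, mirroring
 *   DEST[63-0]   <- DEST[63-0]   + SRC[63-0]
 *   DEST[127-64] <- DEST[127-64] + SRC[127-64]
 */
static void add_packed_double(double dest[2], const double src[2])
{
#if defined(__SSE2__)
    __m128d d = _mm_loadu_pd(dest);          /* unaligned load of 2 doubles */
    __m128d s = _mm_loadu_pd(src);
    _mm_storeu_pd(dest, _mm_add_pd(d, s));   /* both lanes added at once */
#else
    /* Scalar fallback with the same result, one lane at a time. */
    dest[0] += src[0];
    dest[1] += src[1];
#endif
}
```

Calling add_packed_double on {1.5, 2.0} and {0.25, 4.0} leaves {1.75, 6.0} in the first array. The unaligned load/store intrinsics are used here only so the sketch makes no alignment assumption about the caller's arrays.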



To: fyodor_ who wrote (19123) 11/14/2000 9:42:09 PM
From: Scott D.
 
re: SIMD II double precision floats

I have some experience with the old FPU math, but so far have only read about SIMD and SIMD II FPU math (I have used MMX and SIMD heavily for integer math).

From reading the public documents on the Intel site, it appears that the Pentium 4 gains the ability to operate on two 64-bit (double precision) floats in parallel. For example, see the ADDPD instruction description:

ADDPD - Packed Double-Precision Floating-Point Add

Description
Performs a SIMD add of the two packed double-precision floating-point values from the source operand (second operand) and the destination operand (first operand), and stores the packed double-precision floating-point results in the destination operand. The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. See Figure 11-3 in the IA-32 Intel Architecture Software Developer's Manual, Volume 1 for an illustration of a SIMD double-precision floating-point operation.

Operation
DEST[63-0] <- DEST[63-0] + SRC[63-0];
DEST[127-64] <- DEST[127-64] + SRC[127-64];

Opcode         Instruction              Description
66 0F 58 /r    ADDPD xmm1, xmm2/m128    Add packed double-precision floating-point values from xmm2/m128 to xmm1.



To: fyodor_ who wrote (19123) 11/15/2000 12:54:56 PM
From: jcholewa
 
> If that's the case, then I'm in shock. The Athlon is seriously almost 2x as fast as the P3
> when doing double precision fp. I'm truly in awe of Intel's compiler work then.

SPECfp is really, really, really memory dependent. It is the use of prefetch instructions, together with the P3's better cache subsystem and chipset memory implementation, that gives the P3 its strong performance when optimized.

    -JC
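
To make the prefetch point above concrete, here is a minimal C sketch of software prefetching in a memory-bound loop. It is not taken from any SPECfp source or from Intel's compiler output. The names sum_with_prefetch, PREFETCH_READ and PREFETCH_AHEAD are invented for this example, the prefetch distance is an arbitrary assumption, and __builtin_prefetch is a GCC/Clang builtin (on other compilers the macro expands to nothing).

```c
#include <stddef.h>

/* GCC/Clang builtin: read prefetch (0), high temporal locality (3).
 * Elsewhere this is a no-op; prefetching is only a performance hint. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
#else
#define PREFETCH_READ(addr) ((void)0)
#endif

/* Assumed prefetch distance in elements; a real value needs tuning. */
enum { PREFETCH_AHEAD = 64 };

/* Hypothetical helper: sums n doubles while asking the CPU to start
 * fetching data that will be needed PREFETCH_AHEAD iterations later. */
static double sum_with_prefetch(const double *data, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) {
        /* Stay inside the array; issuing one hint per element is
         * wasteful but keeps the sketch short. */
        if (i + PREFETCH_AHEAD < n)
            PREFETCH_READ(&data[i + PREFETCH_AHEAD]);
        total += data[i];
    }
    return total;
}
```

The result is the same as a plain summation loop; the hint only changes when data arrives in cache. Whether it helps at all depends on the processor, the memory system and the access pattern, which is the kind of tuning the post attributes to the optimized P3 results.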