Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: TimF who wrote (21847)
12/7/2000 4:39:32 PM
From: Petz
 
Tim, the data that fyo posted shows that SSE2 can do two multiplies or two adds every two cycles. In non-SSE2 mode a single multiply takes two cycles and an add is one cycle.

This really doesn't make any sense, but that's what the data says. Most floating-point code could take some advantage of being able to do two multiplies, rather than one, in two cycles.
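To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It takes the throughput figures quoted above at face value (they are not independently verified) and assumes, purely for illustration, that multiplies and adds never overlap and that the code can always keep both SSE2 lanes busy. The names `THROUGHPUT`, `cycles` and `speedup` are made up for this example.

```python
from fractions import Fraction

# Operations per cycle, taken from the figures quoted in this post:
#   SSE2: two multiplies or two adds every two cycles
#   x87 (non-SSE2): one multiply every two cycles, one add every cycle
THROUGHPUT = {
    "sse2": {"mul": Fraction(2, 2), "add": Fraction(2, 2)},
    "x87": {"mul": Fraction(1, 2), "add": Fraction(1, 1)},
}

def cycles(mode, n_mul, n_add):
    """Idealized cycle count, assuming multiplies and adds do not overlap."""
    t = THROUGHPUT[mode]
    return n_mul / t["mul"] + n_add / t["add"]

def speedup(n_mul, n_add):
    """Ratio of x87 cycles to SSE2 cycles for the same mix of operations."""
    return cycles("x87", n_mul, n_add) / cycles("sse2", n_mul, n_add)

if __name__ == "__main__":
    for n_mul, n_add in [(100, 0), (100, 100), (0, 100)]:
        print(n_mul, n_add, speedup(n_mul, n_add))
```

Under those assumptions the gain comes entirely from the multiplies: multiply-only code would run twice as fast, an even mix of multiplies and adds about 1.5 times as fast, and add-only code no faster at all.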

Petz



To: TimF who wrote (21847)
12/7/2000 5:07:19 PM
From: fyodor_
 
Tim: "SSE2 is SIMD. If you have a program that does not apply a single instruction to multiple data sets, would it still be much faster than the standard x86 floating point?"

As Petz also hints at, I've been back-tracking quite a bit on the P4 SSE2 implementation. The numbers I dug up are here:

Message 14961526

-fyo