Technology Stocks : Advanced Micro Devices - Moderated (AMD)
To: Petz who wrote (21643)
Date: 12/6/2000 1:48:34 PM
From: jcholewa
 
> From what I've read, not true for double precision (64 bits or 80 bits). In fact, the latencies are HIGHER on
> the P4 than on the P3, but the throughput is exactly the same, i.e., half that of the Athlon core.

I am nearly positive that SSE2 is implemented as one pipeline with two units (one for mulpd, the other for addpd). Each unit can accept a new instruction only every other cycle, but the pipeline as a whole can accept an instruction every cycle. Therefore, you can alternate addpd and mulpd every cycle, which gives a throughput of one instruction per cycle, or two operations per cycle (each packed instruction works on two doubles).

In double precision code alternating between fadds and fmuls, the Athlon can do both one add and one mul per cycle, which is also two operations per cycle.

If the code happened to be all adds or all muls, then the P4 (in SSE2) would do two operations every two cycles while the Athlon would do one operation every cycle. Either way, that is one operation per cycle.

This applies only to 64-bit double precision. I believe that SSE2 does not support 80-bit extended precision, so in that case the Pentium 4's peak is half the Athlon's, yes.
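
To put rough numbers on the 64-bit cases above, here is a minimal Python sketch of the peak-throughput arithmetic. It only encodes the issue rules as I described them (which I am "nearly positive" about, not certain), and it assumes fully independent instructions with no latency, decode, or memory limits. The function names are made up for illustration.

```python
def p4_sse2_ops_per_cycle(adds, muls):
    """Peak double-precision ops/cycle for a loop body containing
    `adds` addpd and `muls` mulpd instructions (assumed independent)."""
    if adds + muls == 0:
        raise ValueError("need at least one instruction")
    # Assumed rules: the pipeline takes one instruction per cycle,
    # and each unit takes a new instruction only every other cycle.
    cycles = max(adds + muls, 2 * adds, 2 * muls)
    # Each packed instruction performs two double-precision operations.
    return 2 * (adds + muls) / cycles


def athlon_x87_ops_per_cycle(adds, muls):
    """Peak double-precision ops/cycle for a loop body containing
    `adds` fadd and `muls` fmul instructions (assumed independent)."""
    if adds + muls == 0:
        raise ValueError("need at least one instruction")
    # Assumed rule: one fadd and one fmul can both start every cycle.
    cycles = max(adds, muls)
    # Each x87 instruction performs one operation.
    return (adds + muls) / cycles


if __name__ == "__main__":
    for adds, muls in [(1, 1), (1, 0), (0, 1)]:
        print(adds, muls,
              p4_sse2_ops_per_cycle(adds, muls),
              athlon_x87_ops_per_cycle(adds, muls))
```

With an even add/mul mix both chips come out at 2 operations per cycle; with all adds or all muls both drop to 1. This is peak arithmetic only and says nothing about real code.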

> The problem with the Athlon's double precision math advantage is that very often the weak link in the chain
> is the L2 cache throughput or the memory throughput.
> Single channel PC2100 can't match dual channel RDRAM with a 400 MHz bus.

That is a valid assessment. More or less.

    -JC