Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Joe NYC who wrote (72466) 2/26/2002 12:19:01 PM
From: combjelly
 
"I doubt that the front end was doubled (decoder / trace cache) from Willy to NW."

The trace cache probably wasn't doubled, but the decoder would almost have to be because of the trace cache since it decouples the decoder from the rest of the pipeline. The trace cache would have to have some added tags so that the thread could be differentiated, but that might have been in there from the start.

As I stated in the post that started this aspect of the thread, I suspect that Tench is correct when he stated that code would have to be written specifically for hyperthreading to show a good increase in performance. This is problematic because I cannot see how the compiler can do this, so the responsibility to do it correctly lies with the programmer.



To: Joe NYC who wrote (72466) 2/26/2002 3:52:43 PM
From: Petz
 
re: Intel is reporting strong benchmark gains from hyperthreading, but no one else is

Well, Intel tried to imply that, but did not say that.

Even on Intel's largest benchmark gain of 80%, how much comes from the 120% increase in clock speed, how much comes from the 100% increase in cache size, and how much comes from the 50% increase in memory bandwidth? How much comes from SSE2 instructions in the Intel-written benchmark? How much increase comes from the celebrated Netburst architecture? HA! We know that one is NEGATIVE.

Whatever the other numbers are, it doesn't leave much performance improvement to attribute to hyperthreading.

Two of the benchmarks Intel used appear to be memory bandwidth limited, because a 1.4 GHz PIII with 512K cache only does 15-20% better than a 1 GHz PIII with half the cache.
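
To put rough numbers on the points above, here is a minimal sketch of the arithmetic. It assumes the 80% gain and the 120% clock increase describe the same pair of systems, and it treats the 1.4 GHz vs. 1 GHz PIII comparison as a 40% clock increase; the function name `per_clock_ratio` is made up for illustration.

```python
def per_clock_ratio(perf_gain_pct, clock_gain_pct):
    """Ratio of new to old work done per clock cycle.

    A value below 1.0 means the benchmark score grew more slowly
    than the clock speed did.
    """
    return (1 + perf_gain_pct / 100) / (1 + clock_gain_pct / 100)


# Intel's best case: 80% higher score at a 120% higher clock (assumed pairing).
prestonia = per_clock_ratio(80, 120)

# PIII 1.4 GHz vs. PIII 1 GHz: 15-20% higher score at a 40% higher clock.
piii_low = per_clock_ratio(15, 40)
piii_high = per_clock_ratio(20, 40)

print(f"Best case, per clock:   {prestonia:.2f}")
print(f"PIII 1.4 GHz vs. 1 GHz: {piii_low:.2f} to {piii_high:.2f}")
```

On these assumptions the per-clock ratio is below 1.0 in every case, so the score did not keep pace with the clock, and that is before giving any credit to the larger cache, the faster memory, or SSE2.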

On the PIII systems, they used PC133 memory, but on the Prestonia system, they used DDR1600, and possibly that is also dual channel.

I don't think they doubled the decoder. The fact that instructions may be coming from two different threads is irrelevant because each instruction stands alone.

Petz