Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: fyodor_ who wrote (53495) 8/31/2001 7:16:19 PM
From: wanna_bmw
 
Fyo, now you're getting back to my expertise. x86 processors already use their resources terribly inefficiently. Even when Windows NT shows the processor running at 100%, it's probably only using 30% of its resources (i.e. execution units). 2-way hyperthreading probably only increases that to 50%.

wanna_bmw



To: fyodor_ who wrote (53495) 8/31/2001 8:49:07 PM
From: Saturn V
 
Ref < adding HyperThreading doesn't increase the resources of a processor. It only allows the processor to use these resources better. If programs are already written in such a way that the processor resources are the bottle-neck, no increase in performance will be observed >

Today it is impossible for the processor's ALU and FP resources to be the bottleneck. I am familiar with code optimisation at the assembler level, and today the multiple ALUs and FP units sit unutilized most of the time. Even the double-speed ALU of the P4 is overkill.

The most frequent bottleneck is memory. The biggest killer is an L2 miss, which can stall the ALU for 100 cycles or more. An L1 miss will cause a stall of several cycles too. However, if the system is hyperthreaded and one thread is stalled, the other thread is free to use the ALUs and the FPU.

Similarly, an FPU operation can take a few cycles on one thread. On a single-threaded processor the ALUs will sit idle during that time, because subsequent operations depend on the result. With hyperthreading, however, the second thread can use the ALUs.

Obviously the throughput will not be 2X that of a single-threaded system. But a throughput improvement of 15%-40% sounds plausible.
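
To make the stall-overlap argument above concrete, here is a rough toy model in Python. It is only a sketch under loud assumptions: one shared ALU, each thread executes a fixed burst of ALU cycles and then stalls for a fixed number of cycles on a cache miss, and the lowest-numbered ready thread always wins the ALU. The function name `simulate` and the parameters `work` and `stall` are made up for illustration; the 100-cycle stall is taken from the L2-miss figure mentioned above, while the 30-cycle burst is an arbitrary choice.

```python
def simulate(num_threads, cycles, work=30, stall=100):
    """Toy model of one shared ALU (not a real CPU simulator).

    Each hardware thread executes `work` ALU cycles, then stalls for
    `stall` cycles (standing in for a cache miss). Returns the fraction
    of cycles in which the ALU was busy.
    """
    remaining_work = [work] * num_threads  # ALU cycles left before the next miss
    stalled_until = [0] * num_threads      # first cycle at which each thread is ready
    busy = 0
    for cycle in range(cycles):
        for t in range(num_threads):
            if stalled_until[t] <= cycle:
                # Thread t is ready, so it gets the ALU for this cycle.
                busy += 1
                remaining_work[t] -= 1
                if remaining_work[t] == 0:
                    # Burst finished: the thread now waits out a miss.
                    remaining_work[t] = work
                    stalled_until[t] = cycle + 1 + stall
                break  # only one thread can use the ALU per cycle
    return busy / cycles


if __name__ == "__main__":
    single = simulate(1, 13000)
    dual = simulate(2, 13000)
    print(f"1 thread : ALU busy {single:.1%}")
    print(f"2 threads: ALU busy {dual:.1%}")
```

With these assumed numbers the ALU is busy roughly 23% of the time with one thread and roughly 46% with two, because the second thread's bursts fit entirely inside the first thread's stalls. The model deliberately ignores everything else the two threads would share (caches, fetch and decode bandwidth, memory bandwidth), so it shows only the best case for the overlap effect and says nothing about where in the 15%-40% range a real chip would land.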