Technology Stocks : Advanced Micro Devices - Moderated (AMD)


To: Scumbria who wrote (640)    7/13/2000 8:53:25 AM
From: pgerassi
 
Dear Bruce and Scumbria:

Another problem with prefetching is that when the CPU switches context (starts processing another thread or process), all of the prefetched data becomes useless to the new task. Thus on heavily loaded servers and under heavy multitasking, prefetching becomes a real drag on performance. Prefetching is therefore only useful in nearly single-tasking workloads like simulations, games, and SPEC. It is a drag on things like TPC and web serving.

Even in those situations where it can help, prefetching needs more instruction decode power and shifts the balance toward more decode and address generation units rather than ALUs and FPUs. P4's trace cache may help with the decode power needed, but the design still needs more address generation units. This is where the architecture of P4 is unclear. Also, the trace cache magnifies the penalties incurred by context switches.

On Athlon, there is one address generation unit per integer ALU and one address generation unit shared by both FPU pipelines (add and multiply). The Athlon decode units generate up to three address generation operations per CPU cycle and can easily handle the prefetch load. This is currently not appreciated because current chipsets are memory bus limited: the memory bus bandwidth is less than the FSB bandwidth. When SMP motherboards with single or dual DDR memory bus channels begin to be used in benchmarks, the twin advantages of the core and the PTP (point-to-point) bus will become apparent.
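
To put rough numbers on the memory bus versus FSB gap, here is a back-of-the-envelope sketch. The clock rates and bus widths are my assumptions from nominal figures (a 64-bit Athlon FSB at 100 MHz, double pumped; 64-bit PC100/PC133 SDRAM; 64-bit DDR at 100 and 133 MHz), not numbers from any particular chipset, and these are peak rates, not sustained ones. The function name is just for illustration.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak transfer rate in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8

# Assumed nominal figures; actual boards and chipsets may differ.
buses = {
    "Athlon FSB (100 MHz, double pumped)": peak_bandwidth_mb_s(100, 2, 64),
    "PC100 SDRAM": peak_bandwidth_mb_s(100, 1, 64),
    "PC133 SDRAM": peak_bandwidth_mb_s(133, 1, 64),
    "DDR at 100 MHz": peak_bandwidth_mb_s(100, 2, 64),
    "DDR at 133 MHz": peak_bandwidth_mb_s(133, 2, 64),
}

fsb = buses["Athlon FSB (100 MHz, double pumped)"]
for name, mb_s in buses.items():
    # Ratio below 1.0 means the bus cannot keep the FSB fed at peak.
    print(f"{name}: {mb_s:.0f} MB/s ({mb_s / fsb:.2f}x FSB)")
```

Under these assumptions, single data rate SDRAM tops out below the FSB peak, while a single DDR channel matches or exceeds it.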

Since this will probably happen before or at worst during the Mustang / Corvette rollout, the average Joe Blow might assume that the new cores are responsible when it is really the chipset and memory. I look forward to seeing some benchmarks from DDR-equipped motherboards, perhaps as soon as the upcoming platform conference.

Pete