Politics : Formerly About Advanced Micro Devices

To: John Evans who wrote (123086)
8/26/2000 6:25:53 PM
From: Scumbria
 
John,

we don't know enough about the underlying RISC core of the P4 to make comparisons with the P3.

All x86 cores break instructions down into loads, stores, and arithmetic operations in a rather similar fashion. Most do it at the front of the pipeline. Cyrix processors do it in the middle of the pipe (using microcode). It is pretty safe to assume that P4 micro-ops are very similar to those of the PIII.
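
To make the load/store/arithmetic breakdown concrete, here is a minimal sketch of how a decoder might split a register-memory instruction into micro-ops. Everything in it is an illustrative assumption: the `MicroOp` record, the `decode` function, the `tmp0` temporary, and the bracketed memory-operand syntax are hypothetical and do not describe the actual P4 or PIII micro-op formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    kind: str      # "load", "store", or the arithmetic opcode (e.g. "add")
    dest: str      # where the result goes
    srcs: tuple    # operands read by this micro-op


def is_memory(operand):
    """Toy convention (an assumption): memory operands look like "[ebx]"."""
    return operand.startswith("[")


def decode(opcode, dest, src):
    """Split one two-operand instruction into load / arithmetic / store micro-ops."""
    uops = []
    if is_memory(dest):
        # Read-modify-write: load the old value, operate on it, store it back.
        uops.append(MicroOp("load", "tmp0", (dest,)))
        uops.append(MicroOp(opcode, "tmp0", ("tmp0", src)))
        uops.append(MicroOp("store", dest, ("tmp0",)))
    elif is_memory(src):
        # Load the memory source first, then do register arithmetic.
        uops.append(MicroOp("load", "tmp0", (src,)))
        uops.append(MicroOp(opcode, dest, (dest, "tmp0")))
    else:
        # Register-register form needs only the arithmetic micro-op.
        uops.append(MicroOp(opcode, dest, (dest, src)))
    return uops


for uop in decode("add", "[ebx]", "eax"):
    print(uop)
```

In this toy model `add [ebx], eax` becomes three micro-ops while `add eax, ebx` stays a single one, which is the sense in which different cores end up with similar internal operations.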

on the P4 the shift instructions won't be double-pumped (will have high latency).

It makes very little difference whether they are single- or double-pumped. Latency is only a problem if it introduces bubbles. Single-pumped ALUs do not introduce bubbles.
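
The point that latency only hurts when it creates bubbles can be sketched with a toy issue model. This is a deliberately simplified assumption: one in-order issue slot per cycle, a fully pipelined unit, and a single `latency` value for every operation. The function `count_bubbles` and the instruction names are hypothetical, and a real out-of-order core would find independent work on its own.

```python
def count_bubbles(ops, latency):
    """Count idle issue slots in a toy in-order, one-issue-per-cycle pipeline.

    ops: list of (name, dep) pairs, where dep is the index of the earlier op
         whose result is needed, or None if the op is independent.
    latency: cycles from issue until the result can be consumed.
    """
    issue_cycle = []   # cycle in which each op was issued
    bubbles = 0
    cycle = 0          # next free issue slot
    for _name, dep in ops:
        earliest = cycle
        if dep is not None:
            # Cannot issue before the producer's result is available.
            earliest = max(cycle, issue_cycle[dep] + latency)
        bubbles += earliest - cycle
        issue_cycle.append(earliest)
        cycle = earliest + 1
    return bubbles


# Three back-to-back dependent ops: higher latency shows up as bubbles.
chain = [("shl", None), ("add", 0), ("sub", 1)]
# Same consumer, but an independent op fills the slot after the shift.
filled = [("shl", None), ("mov", None), ("add", 0)]

for latency in (1, 2):
    print(latency, count_bubbles(chain, latency), count_bubbles(filled, latency))
```

With independent work available to fill the slot, the two-cycle case costs no more issue slots than the one-cycle case in this model; only the tightly dependent chain pays for the extra latency.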

As you increase the pipeline size, you not only suffer from a larger branch prediction penalty, you also have potentially more dependency and scheduling problems.

Forwarding data is more complicated in a deeper pipeline, but is manageable. For timing purposes, the P4 pipe has a couple of stages devoted to forwarding data. Those extra stages become a performance issue when the pipe is flushed, because the refill takes longer.
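
A rough way to see the refill cost is the usual back-of-the-envelope CPI estimate, where each mispredicted branch costs one pipeline refill. The numbers below (branch fraction, mispredict rate, and refill depths of 10 and 20 stages) are illustrative assumptions only, not measured figures for the PIII or P4, and `effective_cpi` is a hypothetical helper.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, refill_stages):
    """Estimate cycles per instruction including branch-flush refills.

    base_cpi:         CPI with perfect branch prediction
    branch_fraction:  fraction of instructions that are branches
    mispredict_rate:  fraction of branches that are mispredicted
    refill_stages:    cycles needed to refill the pipe after a flush
    """
    flush_penalty = branch_fraction * mispredict_rate * refill_stages
    return base_cpi + flush_penalty


# Same workload and predictor, two assumed refill depths.
for stages in (10, 20):
    cpi = effective_cpi(base_cpi=1.0, branch_fraction=0.2,
                        mispredict_rate=0.1, refill_stages=stages)
    print(f"{stages} refill stages: CPI ~ {cpi:.2f}")
```

The penalty term scales linearly with refill depth, so every stage added to the pipe raises the cost of each flush unless the mispredict rate drops to compensate.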

Scumbria