Technology Stocks : Intel Corporation (INTC)


To: pgerassi who wrote (108233)8/24/2000 12:00:05 PM
From: Tony Viola  Read Replies (2) | Respond to of 186894
 
Pete, >Scumbria is not wrong.

Well then this guy is wrong:

Jeff Austin, Intel's IA-32 architect launch manager, said the Pentium 4's 20-stage pipeline suffers no penalty for pre-fetch misprediction because of its use of the NetBurst technology.

However, Jeff Austin has the luxury of going out to the lab and checking performance his engineers are seeing on real hardware. You guys are merely speculating.

============================================================
Intel Offers Details Behind Pentium 4's Speed
THURSDAY, AUGUST 24, 2000 11:30 AM
- TechWeb

Aug 24, 2000 (Tech Web - CMP via COMTEX) -- SAN JOSE, Calif. -- Countering claims made recently by an industry microprocessor research firm, Intel Corp. said the upcoming Pentium 4 has no deep pipeline performance penalty.

Intel (stock:INTC) executives at this week's Intel Developers Forum detailed the Pentium 4's NetBurst technology, which they said significantly increases performance over other processors, while nearly doubling the number of processor pipeline stages.

Jeff Austin, Intel's IA-32 architect launch manager, said the Pentium 4's 20-stage pipeline suffers no penalty for pre-fetch misprediction because of its use of the NetBurst technology.

Misprediction, which sounds like an arcane technical question, is a key performance factor. To increase the speed of operations and data rates, modern processors literally guess in advance what data will be needed.

If the processor guesses wrong, a deep 20-stage pipeline such as Pentium 4 can take up to 13 clock cycles to purge all the data and be refilled, slowing operations.
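
As a rough illustration of why that flush cost matters, here is a minimal sketch of the usual textbook stall model, assuming the article's "misprediction" means an ordinary branch misprediction. Only the 13-cycle penalty comes from the article; the base CPI, the branch fraction, and the misprediction rates are hypothetical values chosen for illustration, not Intel figures.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction once misprediction stalls are added.

    Textbook approximation: every mispredicted branch costs a fixed
    number of flush-and-refill cycles on top of the base CPI.
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


PENALTY_CYCLES = 13     # worst-case refill cost quoted in the article
BASE_CPI = 1.0          # assumption: one cycle per instruction with no stalls
BRANCH_FRACTION = 0.20  # assumption: one instruction in five is a branch

# Hypothetical misprediction rates, to show how sensitive the result is.
for rate in (0.02, 0.05, 0.10):
    cpi = effective_cpi(BASE_CPI, BRANCH_FRACTION, rate, PENALTY_CYCLES)
    slowdown = cpi / BASE_CPI - 1.0
    print(f"mispredict rate {rate:.0%}: CPI {cpi:.3f}, slowdown {slowdown:.1%}")
```

Under these assumptions the slowdown scales linearly with the misprediction rate, which is why a better predictor can offset a longer pipeline.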

Bert McComas, an analyst at InQuest Research Inc., Gilbert, Ariz., claimed recently that the pre-fetch misprediction problem causes the 1.4-GHz Pentium 4 to operate at the same performance level as the 1.13-GHz Pentium III.
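
To put McComas's claim in per-clock terms, the sketch below works out the arithmetic, assuming performance is simply instructions per clock (IPC) times clock frequency. The two clock speeds are from the article; the function name and the equal-performance premise are only a restatement of his claim, not measured data.

```python
def implied_ipc_ratio(slower_clock_ghz, faster_clock_ghz):
    """IPC of the faster-clocked chip relative to the slower-clocked one,
    assuming both deliver the same overall performance and that
    performance = IPC * clock frequency."""
    return slower_clock_ghz / faster_clock_ghz


PENTIUM_III_GHZ = 1.13  # from the article
PENTIUM_4_GHZ = 1.4     # from the article

ratio = implied_ipc_ratio(PENTIUM_III_GHZ, PENTIUM_4_GHZ)
ipc_loss_percent = (1.0 - ratio) * 100.0

print(f"Implied Pentium 4 IPC relative to Pentium III: {ratio:.3f}")
print(f"Implied per-clock loss: {ipc_loss_percent:.1f}%")
```

If the claim held, the Pentium 4 would be doing roughly four-fifths of the work per clock that the Pentium III does.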

Austin, however, said NetBurst corrects most of the misprediction problem, with the Pentium 4 performing at the highest level of any processor to date from Intel, Santa Clara, Calif.

Allowing the deep Pentium 4 pipeline to meet performance targets is only one of NetBurst's goals, as the device also aims to provide much faster integer and floating-point-instruction operations.

NetBurst includes Advanced Dynamic Execution, a speculative engine that helps increase memory pre-fetch prediction rates greatly, according to Intel.

The technique uses three times as many instructions operating in pre-fetch as the Pentium III and includes more sophisticated algorithms that look at many prior executions before making a prediction on data to be accessed, Austin said.

The Pentium 4 also features a Level 1 on-chip cache that executes already decoded instructions, thus eliminating latency delays. The L1 cache of the Pentium III, in comparison, must decode instructions each time they are issued, slowing the speed at which data is fed to the processor.

NetBurst's Rapid Execution Engine is another feature and includes an ALU integer-processor running at 2.8 GHz, which is twice the main-processor clock speed and provides extremely rapid processing of integer instructions, Austin said.

A new Streaming SIMD-2 Extension in NetBurst also speeds processing by performing integer arithmetic operations on 128 bits every clock cycle, twice as fast as the Pentium III. Additionally, Intel said, NetBurst adds a 128-bit double-precision floating-point operation not found in the Pentium III.

techweb.com

Copyright (C) 2000 CMP Media Inc.


RELATED SYMBOLS

INTC
73 3/4 -7/8

RESR
5 1/2 UNCH







News Provided By COMTEX



To: pgerassi who wrote (108233)8/24/2000 12:12:11 PM
From: Windsock  Read Replies (1) | Respond to of 186894
 
Re: "Scumbria is not wrong. If the instructions are not in the trace cache which takes a cycle or two to figure out" ... BLAH, BLAH, BLAH.

It is great that PeckerHead and Scumbria know all the reasons that the P 4 architecture will not work.

Unfortunately, the Intel engineers were not aware the design would not work. So they went ahead and made the P 4 architecture work anyway.

Brilliant guy, that PeckerHead !!



To: pgerassi who wrote (108233)8/24/2000 1:41:51 PM
From: Tenchusatsu  Read Replies (1) | Respond to of 186894
 
Pete, you are amazing. You really do sound like you know what you are talking about. Here's an example:

<This is borne out by the Huge size of the trace cache. Originally it was supposed to be no more than a few hundred micro ops. It has since ballooned to 12 thousand ops. This is probably why the die expanded from 170 mm2 to 217 mm2.>

Wow, I never knew there was such an increase in die size. (And I thought at least one of those figures was just wild speculation by the press.) I also never knew about the increase in the trace cache size, or the "fact" that it was supposed to be no more than a few hundred micro-ops.

Here's another:

<This means that the original design goal for IPC or a 10 to 20% loss in IPC was much higher in practice. It was probably more like 30 to 50% loss. >

Only someone who was involved in the Willamette design and validation would ever know all this. Once again, Pete, you are amazing.

Tenchusatsu

P.S. - How come you don't know half as much about the Athlon architecture, given that you are an AMD fan?



To: pgerassi who wrote (108233)8/24/2000 4:00:51 PM
From: Joey Smith  Read Replies (2) | Respond to of 186894
 
Pete, anyone can make propositions (+ or -) on P4 performance based on the limited facts at hand. Even you have admitted there is a disconnect between theory and practice. However, the fact that an Intel designer at IDF specifically said No Performance Degradation clock-for-clock gives me confidence. Also, you might want to check out Aces, who says P4 will outperform new Athlons clock-for-clock. I also liked the fact that Intel was able to run an air-cooled .18 P4 part at 2GHz. There seems to be a lot of headroom with this architecture. All in all, I'm happy with what I see with P4, which is pretty much a 2001+ product anyway.

Joey