Technology Stocks : Rambus (RMBS) News Only
To: REH who wrote ()10/10/1998 8:22:00 AM
From: REH   of 236
 
Next Cyrix core aims at 600-MHz Pentium II performance level -- Hot CMOS, deep superscalar spice Jalapeno

Oct. 09, 1998 (Electronic Engineering Times - CMP via COMTEX) --
Dallas - Armed with a hot CMOS process from parent National
Semiconductor Corp. and an aggressive superscalar architecture, Cyrix
is taking aim at the high end of Intel's IA-32 processor line. The
company is planning to pop its next hot CPU core, code-named Jalapeno,
just in time to catch Intel in midtransition from the IA-32 to Merced.

"We are aiming Jalapeno at the 600-MHz Pentium-II performance level,"
said Mark Bluhm, Cyrix vice president of engineering. "That will be
much faster than any estimates we have seen of Merced speed on IA-32
code, and it should be competitive with Intel's IA-32 high end at the
time."

According to the company, Jalapeno represents both solid engineering
developments on existing themes and some significant departures from
Cyrix's traditional way of doing things.

In its traditional approach, Cyrix prided itself on high instructions
per clock, trying to execute most X86 instructions in a single cycle.
The architects argued that this higher efficiency would lead to better
overall performance, even if it caused circuit complexities that
limited maximum clock frequency.

The company won the battle, demonstrating benchmark performance on
its M-II processor that exceeded the performance of higher-frequency
Intel CPUs.

But at the same time, Cyrix was losing the war. "The reality is that
retail end users don't buy on performance - they buy on clock frequency,"
stated Stan Swearingen, Cyrix vice president for desktop products. "The
irony is that because we had done a better job on core design, we were
getting beaten on the retail shelf."

This experience led to a new way of thinking. "Before, we would
compromise on maximum clock rate to get better execution speed,"
Swearingen said. "Starting with Jalapeno, we decided to put the
priority on megahertz, and not to trade them easily for higher
performance."
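
The trade-off Swearingen describes can be put in rough numbers. Delivered
throughput is approximately instructions per clock (IPC) times clock
frequency, so a core with a lower clock can still finish more work per
second. The sketch below is illustrative only: the function name and both
sets of figures are hypothetical and are not measurements of any Cyrix or
Intel part.

```python
def instr_per_second_millions(ipc, clock_mhz):
    """Approximate throughput in millions of instructions per second.

    Uses the simple model: throughput = IPC x clock frequency.
    """
    return ipc * clock_mhz


# Hypothetical figures chosen only to illustrate the trade-off.
high_ipc_core = instr_per_second_millions(ipc=1.25, clock_mhz=240)
high_clock_core = instr_per_second_millions(ipc=1.0, clock_mhz=280)

print(high_ipc_core, high_clock_core)
```

Under these assumed numbers the 240-MHz core does more work per second
than the 280-MHz core, yet it carries the smaller number on the retail
box, which is the irony the article points to.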

Core checklist

The new approach guided the Jalapeno team through a checklist of
improvements over the current M-II core. "There were things to do to
make the core run faster," Bluhm said, "like larger caches, more
efficient pipelines and more execution units."

The latter capability conceals some surprises that Cyrix is saving
for its paper at this year's Microprocessor Forum, Bluhm said. "In
general, I don't think it's reasonable to issue more than a couple or
three instructions per clock in any general-purpose instruction
set - especially the X86 instruction set," Bluhm said. "But it turns out
that you can definitely benefit from having more than a couple of
execution units in the core."

The apparent contradiction may stem from Cyrix's decision to tune Jalapeno
for megahertz. In order to get the highest possible clock rate, some
critical paths have to be shortened. That, in turn, makes it impossible
to keep some complex - and infrequent - operations to a single cycle. The
architect has the choice of extending the pipeline to give the
operation more cycles in which to complete, or of permitting the
operation to stall the pipeline for a cycle or two.

If the architect takes the latter choice - and many feel that shorter
pipelines are better pipelines - then additional execution units could in
fact increase throughput. If a floating-point operation stalled the FP
pipe for a cycle, for instance, consecutive FP instructions would stall
the processor. But adding another FP pipe - even one that didn't support
all possible FP instructions - could prevent the dispatch stall.
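
The stall argument above can be sketched with a toy model. This is not
Cyrix's design, which the article does not disclose. It assumes in-order
issue of at most one instruction per cycle, and FP pipes that stay busy
for a fixed number of cycles after accepting an operation. The function
name, the "fp"/"int" labels and the busy time of 2 cycles are all
assumptions made for illustration.

```python
def cycles_to_issue(stream, fp_pipes, fp_busy=2):
    """Count cycles needed to issue every instruction in `stream`.

    Toy model: in-order, one instruction issued per cycle. An FP pipe
    that accepts an op cannot accept another for `fp_busy` cycles, so
    fp_busy=2 means each FP op stalls its own pipe for one extra cycle.
    """
    free_at = [0] * fp_pipes  # first cycle each FP pipe can take a new op
    cycle = 0
    for op in stream:
        if op == "fp":
            # Pick the FP pipe that becomes free soonest.
            pipe = min(range(fp_pipes), key=lambda i: free_at[i])
            # Dispatch stalls here if that pipe is still busy.
            cycle = max(cycle, free_at[pipe])
            free_at[pipe] = cycle + fp_busy
        cycle += 1  # the instruction issues in this cycle
    return cycle


back_to_back_fp = ["fp"] * 4
print(cycles_to_issue(back_to_back_fp, fp_pipes=1))  # 7
print(cycles_to_issue(back_to_back_fp, fp_pipes=2))  # 4
```

In this model four consecutive FP instructions take 7 cycles to issue
through one pipe but only 4 through two, even though the issue width
never exceeds one instruction per clock.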

Another potential consumer of execution units is SIMD processing.
Jalapeno will have to face Intel processors equipped with the Katmai
New Instructions (KNI), Intel's second-generation version of MMX. Cyrix
is in the process of deciding whether it will remain with the AMD 3DNow
instruction-set extensions or move to an implementation of KNI.

"We've heard from some ISVs that KNI is superior to 3DNow,"
Swearingen said. "There's discussion on whether AMD intends to evolve
3DNow in that direction, and whether we might go there independently.
One issue is that KNI may use 128-bit registers and some other things
that you can't just adapt to with some simple pipeline changes. We are
still looking into that."

Whatever the decision on execution units, Jalapeno will clearly
require massive amounts of memory bandwidth to keep it fed. In part,
this need will be met by more and larger on-chip caches. While he would
not be specific about Jalapeno's configuration, Bluhm said that large
L2 caches, and even on-chip L3s, may be in order as processors move
beyond 500 MHz. Such cache organizations have in the past been used in
the Alpha CPU family to good effect, even though they can increase die
size enormously.

There will also be a need for high main-memory speed, both in terms
of bandwidth and latency. That means a fast CPU interface to the
Northbridge, and a fast main-memory system.

Until recently, the Northbridge presented a dilemma. Clearly Socket 7
was running out of steam. But with no license for Intel's Slot-1
technology, and given the high cost of that interface, Cyrix appeared
forced into a proprietary bus - the kiss of death in the X86 business.

But Intel has opened a fourth alternative, said Swearingen. "With the
advent of Socket 370, everything changes. Here is an interface that is
inexpensive and implementable. And because it uses GTL-type levels, it
can go 133 MHz or more - much faster than Intel is pushing it."

The interface issue may be moot, since Cyrix apparently plans to
integrate the Northbridge into Jalapeno. That gives the architecture
further advantages. Eliminating the bus crossing between the last cache
and the DRAM controller can gain not only reduced latency but also the
ability to do speculative and anticipatory operations that wouldn't
make sense in a conventional partitioning.

Beyond the Northbridge, the need for speed persists. "You have to
architect a system solution, not just a CPU core," Swearingen said.
"Jalapeno will use a Rambus memory interface. That just makes sense for
the time in which it is coming out - right as Rambus will be crossing
over PC-100 memory buses."
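
For a sense of scale, peak memory bandwidth is bus width times transfer
rate. The article does not say which Rambus configuration Jalapeno would
use, so the sketch below assumes a single 16-bit Direct Rambus channel
at 800 million transfers per second and compares it with a 64-bit PC-100
SDRAM bus at 100 MHz. The function name is hypothetical, and the figures
are theoretical peaks, not sustained rates.

```python
def peak_mb_per_s(bus_width_bits, mega_transfers_per_s):
    """Theoretical peak bandwidth in MB/s: bytes per transfer x MT/s."""
    return (bus_width_bits // 8) * mega_transfers_per_s


pc100_sdram = peak_mb_per_s(bus_width_bits=64, mega_transfers_per_s=100)
# Assumed configuration: one 16-bit Direct Rambus channel at 800 MT/s.
direct_rambus = peak_mb_per_s(bus_width_bits=16, mega_transfers_per_s=800)

print(pc100_sdram, direct_rambus)  # 800 1600
```

Under those assumptions a single Rambus channel offers about twice the
peak bandwidth of PC-100 over a much narrower bus.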

All of this activity will require competitive processes. "In the
past, National has lagged behind in CMOS development," Bluhm admitted,
"but they are working hard to catch up."

Said Swearingen: "We are seeing speeds from National's CMOS-8 process
in South Portland, Maine, that are right on top of what we were seeing
from IBM's CMOS-6S-2. And we have parts in hand from a 0.18-micron
process that is in development."

