McMannis - Looks like AMD's 3DNOW is in Deep Trouble!
Independent software vendors (ISVs) are telling CYRIX that Katmai New Instructions (KNI) are much better than 3DNOW.
And Cyrix may drop 3DNOW support!
{==============================================} eet.com
"Another potential consumer of execution units is SIMD processing. Jalapeno will have to face Intel processors equipped with the Katmai New Instructions (KNI), Intel's second-generation version of MMX. Cyrix is in the process of deciding whether it will remain with the 3DNow instruction-set extensions of Advanced Micro Devices Inc. (AMD), or move to an implementation of KNI.
"We've heard from some ISVs [independent software vendors] that KNI is superior to 3DNow," Swearingen said. "There's discussion on whether AMD intends to evolve 3DNow in that direction, and on whether we might go there independently. One of the issues is that KNI may use 128-bit registers and some other things that you can't just adapt to with some simple pipeline changes. We are still looking into the question."
{====================================}
AMD's 3DNOW may end up being 3DHOME_ALONE
Paul
{==========================================} Here's the whole article on Cyrix:
eet.com
Posted: 9:00 p.m., EDT, 8/6/98
Cyrix serves Jalapeno to burn Intel
By Ron Wilson
DALLAS - Armed with a hot CMOS process from parent National Semiconductor Corp. and an aggressive superscalar architecture, Cyrix Corp. is taking aim at the high end of Intel's IA-32 processor line. The company is planning to pop its next hot CPU core, code-named Jalapeno, just in time to catch Intel in mid-transition from the IA-32 to Merced.
"We are aiming Jalapeno at the 600-MHz Pentium-II performance level," said Cyrix vice president of engineering Mark Bluhm. "That will be much faster than any estimates we have seen of Merced speed on IA-32 code, and it should be competitive with Intel's IA-32 high end at the time."
According to the company, Jalapeno represents both solid engineering developments on existing themes and some significant departures from Cyrix's traditional way of doing things.
In its traditional approach, Cyrix prided itself on high instructions per clock, trying to execute most X86 instructions in a single cycle. The architects argued that this higher efficiency would lead to better overall performance, even if it caused circuit complexities that limited maximum clock frequency.
The company won the battle, demonstrating benchmark performance on its M-II processor that exceeded the performance of higher-frequency Intel CPUs.
But at the same time, Cyrix was losing the war. "The reality is that retail end users don't buy on performance - they buy on clock frequency," stated Cyrix vice president for desktop products Stan Swearingen. "The irony is that because we had done a better job on core design, we were getting beaten on the retail shelf."
This experience led to a new way of thinking. "Before, we would compromise on maximum clock rate in order to get better execution speed," Swearingen said. "Starting with Jalapeno, we decided to put the priority on megahertz, and not to trade them easily for higher performance."
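The tradeoff described above can be put in rough numbers: sustained throughput is approximately instructions per clock (IPC) multiplied by clock frequency. The sketch below is only an illustration. The function name `mips` and the IPC and clock figures are hypothetical and are not taken from the article or from any Cyrix or Intel part.

```python
def mips(ipc: float, clock_mhz: float) -> float:
    """Approximate throughput in millions of instructions per second."""
    return ipc * clock_mhz


# Hypothetical cores, chosen only to show the tradeoff.
high_ipc_core = mips(ipc=1.5, clock_mhz=200)    # more work per clock, lower clock
high_clock_core = mips(ipc=1.0, clock_mhz=300)  # less work per clock, higher clock

print(high_ipc_core, high_clock_core)
```

Under these assumed figures the two cores deliver the same throughput, yet the second carries the larger number on the retail box, which is the situation Swearingen describes.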
The new approach guided the Jalapeno team through a checklist of improvements over the current M-II core. "There were things to do to make the core run faster," Bluhm said, "like larger caches, more efficient pipelines and more execution units."
The latter capability conceals some surprises that Cyrix is saving for its paper at this year's Microprocessor Forum, according to Bluhm. "In general, I don't think it's reasonable to issue more than a couple or three instructions per clock in any general-purpose instruction set - especially the X86 instruction set," Bluhm said. "But it turns out that you can definitely benefit from having more than a couple of execution units in the core."
The apparent contradiction may stem from Jalapeno's decision to tune for megahertz. In order to get the highest possible clock rate, some critical paths have to be shortened. That, in turn, makes it impossible to keep some complex - and infrequent - operations to a single cycle. An architect has the choice of extending the pipeline to give the operation more cycles in which to complete, or of permitting the operation to stall the pipeline for a cycle or two.
If the architect takes the latter choice - and many feel that shorter pipelines are better pipelines - then additional execution units could in fact increase throughput. If a floating-point operation stalled the FP pipe for a cycle, for instance, consecutive FP instructions would stall the processor. But adding another FP pipe - even one that didn't support all possible FP instructions - could prevent the dispatch stall.
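The dispatch-stall argument above can be sketched with a toy model. This is not Jalapeno's actual pipeline, whose details the article does not give. It assumes in-order dispatch of at most one instruction per cycle, that each instruction occupies one execution unit for its full latency, and that every unit can run every instruction (the article notes a second FP pipe might support only a subset). The name `cycles_to_finish` is hypothetical.

```python
def cycles_to_finish(latencies, num_units):
    """Cycles needed to complete a stream of instructions.

    Assumes in-order dispatch, at most one dispatch per cycle, and that an
    instruction holds its unit busy for `latency` cycles (no pipelining
    inside the unit).
    """
    free_at = [0] * num_units  # cycle at which each unit next becomes free
    next_dispatch = 0          # earliest cycle the next instruction may dispatch
    finish = 0
    for latency in latencies:
        # Pick the unit that frees up soonest.
        unit = min(range(num_units), key=lambda u: free_at[u])
        start = max(next_dispatch, free_at[unit])
        free_at[unit] = start + latency
        finish = max(finish, free_at[unit])
        next_dispatch = start + 1
    return finish


# Eight consecutive FP operations that each take two cycles.
fp_stream = [2] * 8
print(cycles_to_finish(fp_stream, num_units=1))
print(cycles_to_finish(fp_stream, num_units=2))
```

In this model the single-unit case stalls on every second cycle, while the second unit lets one operation dispatch per cycle even though the issue width is unchanged.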
Another potential consumer of execution units is SIMD processing. Jalapeno will have to face Intel processors equipped with the Katmai New Instructions (KNI), Intel's second-generation version of MMX. Cyrix is in the process of deciding whether it will remain with the 3DNow instruction-set extensions of Advanced Micro Devices Inc. (AMD), or move to an implementation of KNI.
"We've heard from some ISVs [independent software vendors] that KNI is superior to 3DNow," Swearingen said. "There's discussion on whether AMD intends to evolve 3DNow in that direction, and on whether we might go there independently. One of the issues is that KNI may use 128-bit registers and some other things that you can't just adapt to with some simple pipeline changes. We are still looking into the question."
Whatever decision Cyrix makes about execution units, Jalapeno will clearly require massive amounts of memory bandwidth to keep it fed. In part, this need will be met by more and larger on-chip caches. While he would not be specific about Jalapeno's configuration, Bluhm said that large on-chip L2 caches, and even on-chip L3s, may be in order as processors move beyond 500 MHz. Such cache organizations have in the past been used in the Alpha CPU family to good effect, even though they can increase die size enormously.
Whatever the cache organization, there will be a need for high main-memory speed, both in terms of bandwidth and latency. That means a fast CPU interface to the Northbridge, and a fast main-memory system.
Until recently, the Northbridge question presented Cyrix with a dilemma. Clearly Socket-7 was running out of steam. But without access to a license for Intel's Slot-1 technology, and with the high cost of that interface, Cyrix appeared forced into a proprietary bus - the kiss of death in the X86 business.
But Intel has thoughtfully opened a fourth alternative for the company, according to Swearingen. "With the advent of Socket-370, everything changes. Here is an interface that is inexpensive and implementable. And because it uses GTL-type levels, it can go 133 MHz or more - much faster than Intel is pushing it. We are actively looking at that."
But the interface issue may be moot: Cyrix apparently intends to integrate the Northbridge interface and control circuitry into Jalapeno. That gives the architecture further advantages. By eliminating the bus crossing between the last cache and the DRAM controller, the architect can gain not only reduced latency, but also the ability to do speculative and anticipatory operations that wouldn't make sense in a conventional partitioning.
Beyond the Northbridge, the need for speed persists. "You have to architect a system solution, not just a CPU core," Swearingen said. "Jalapeno will use a Rambus memory interface. That just makes sense for the time in which the part is coming out - right as Rambus will be crossing over PC-100 memory buses."
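The bandwidth comparison behind this remark can be estimated as bus width times transfer rate. The article gives no figures, so the numbers below are assumptions: a 64-bit PC-100 SDRAM bus at 100 million transfers per second, and a single 16-bit Direct Rambus channel at 800 million transfers per second. The function name is hypothetical, and these are peak rates, not sustained ones.

```python
def peak_mb_per_s(bus_bits, mega_transfers_per_s):
    """Peak bandwidth in MB/s for a bus of the given width and transfer rate."""
    return (bus_bits // 8) * mega_transfers_per_s


# Assumed configurations, not taken from the article.
pc100 = peak_mb_per_s(bus_bits=64, mega_transfers_per_s=100)
rambus = peak_mb_per_s(bus_bits=16, mega_transfers_per_s=800)

print(pc100, rambus)
```

Under these assumptions the narrow but fast channel has twice the peak bandwidth of the wide PC-100 bus, though latency is a separate question that this arithmetic does not address.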
All of this activity will require competitive processes. And here, too, the news for Cyrix is good, according to the company. "In the past, National has lagged behind in CMOS development," Bluhm admitted, "but they are working hard to catch up."
"In fact, we are seeing speeds from National's CMOS-8 process in South Portland, Maine, that are right on top of what we were seeing from IBM's CMOS-6S-2," Swearingen said. "And we have parts in hand from a 0.18-micron process that is in development."
Although it has been most noticeable for its aggressive pursuit of the integrated "PC on a chip," a push that will continue with the coming MXi chip, Cyrix fully intends to be a competitor to Intel at the high end of the 32-bit space as well. With its aggressive Jalapeno plans, the company hopes to persuade the market that it can do just that.