Technology Stocks : Intel Corporation (INTC)


To: Paul Engel who wrote (73648) 2/14/1999 4:17:00 PM
From: Paul Engel
 
Intel Investors - A Glimpse at Merced & McKinley Architectures - from Intel's Hans Mulder.

The description of the FOUR Floating point units on Merced - 2 EXTENDED Precision and 2 STANDARD Precision - suggests that MERCED may find a more rapid acceptance as a POWERFUL WORKSTATION CPU - before the big SERVER ramp up.

If that turns out to be true, Intel may be able to ramp up production on Merced faster since Workstation applications - and purchases - can be made on an as-needed basis without regard to corporations making life-or-death decisions on Merced-based Enterprise Server purchases.

On chip L0 and L1 caches are also part of the Merced design, with OFF chip L2 caches. McKinley will have on-chip L0, L1 and L2 caches.

Paul

{================================}

techweb.com

February 15, 1999, Issue: 1048
Section: Systems & Software

Intel's Mulder mulls IA-64
Alexander Wolfe

Dallas - Hans Mulder, principal engineer at Intel Corp. and, more importantly, one of the first at the company to recognize the potential of instruction-level parallelism, offered his take on Merced at the Micro-31 conference, held here last December. (Mulder is also a co-inventor of Patent No. 5,860,017; see story at left.)

The original team that took up the task of defining the IA-64 architecture consisted of five engineers each from Hewlett-Packard Co. and from Intel. "We basically wrote the book that contains the IA-64 instruction set," Mulder said during a luncheon keynote speech at the conference.

In 1991, Mulder said, he was the only one at Intel working full time to study the potential of 64-bit architectures. Now, more than 1,000 people are involved in the massive IA-64 project.

Nevertheless, Mulder was highly circumspect during his talk, giving a nod to his constricted position by jokingly characterizing Merced as the most talked-about piece of vaporware he'd ever seen.

On the technical side, he noted that Merced will contain full IA-32 binary compatibility in hardware. In addition, it will have a massive floating-point unit with two extended-precision multiply/accumulate (FMAC) units and two standard-precision FMACs. That complement will be able to execute up to eight standard-precision floating-point operations per instruction cycle and up to four extended-precision operations.
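
The per-cycle figures quoted above can be sanity-checked with simple arithmetic. The sketch below assumes that one FMAC counts as two floating-point operations (a multiply plus an add) and that each standard-precision unit handles two packed values per cycle. Neither assumption is stated in the article; they are simply one way to arrive at the quoted numbers.

```python
# Back-of-envelope check of the per-cycle floating-point figures quoted above.
# Assumptions (not from the article): one FMAC = 2 operations (multiply + add),
# and each standard-precision unit handles 2 packed values per cycle.
OPS_PER_FMAC = 2


def peak_ops_per_cycle(units: int, lanes_per_unit: int = 1) -> int:
    """Peak floating-point operations per cycle for a group of FMAC units."""
    return units * lanes_per_unit * OPS_PER_FMAC


extended_peak = peak_ops_per_cycle(units=2)                    # two extended-precision FMACs
standard_peak = peak_ops_per_cycle(units=2, lanes_per_unit=2)  # two standard-precision FMACs
print(extended_peak, standard_peak)
```

Under those assumptions the result matches the article: four extended-precision and eight standard-precision operations per cycle.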


Merced will have three levels of caches. Most interesting are its separate L0-level instruction and data caches with a latency of two cycles. L0 caches are a recent architectural trend, intended to make frequently used code more accessible to the CPU on a slightly faster basis than standard-issue L1 caches.

Merced's successor, code-named McKinley, will have its L2 cache on-die.
However, Merced will be fielded as a multi-die cartridge containing custom
SRAMs. One reason for the setup is the tricky task of meeting Merced's
power requirements while simultaneously moving to the new physical cartridge
in which it will be housed.

Tough crowd

Mulder let down his hair a bit during a sometimes raucous
question-and-answer session that took place following dessert and coffee.
One attendee pointedly asked him to explain the difference between very
long-instruction-word architectures and EPIC (explicitly parallel instruction
computing), which is Intel's characterization of IA-64's operation.

The question elicited a roar of laughter from the audience. However, Mulder
didn't take the bait. "That's a good one," he answered soberly. "EPIC is a
combination of techniques that goes beyond VLIW - particularly the
speculation and predication mechanisms.

"I think there's a set of techniques that, if you bring them all together, you can
call them EPIC. It's really to signify that it's [something] more. It's not just
providing fixed-width machines, as in VLIW. It also gives you a lot of features
that have never been used for scalar processing."

Beyond the purely technical, Mulder admitted that the EPIC name was part of
a marketing campaign. "Now it's an accepted acronym," he said. "Maybe not
yet in the academic world, but clearly in the trade press."

Copyright © 1999 CMP Media Inc.
February 15, 1999, Issue: 1048
Section: Systems & Software

Overloading on Merced
Alexander Wolfe

When it comes to reporting on Merced, sometimes there can be too much of
a good thing. Take Hans Mulder, Intel principal engineer, who made a lot of
interesting points in his keynote talk at the recent Micro-31 conference. They
wouldn't all fit into the story that's running on page 43 of this issue. So I'm
going to continue Mulder's mullings here.

Most interesting was the discussion, during the question-and-answer session
following Mulder's talk, of Merced successor McKinley. The latter processor
is due in late 2001. Mulder was asked how McKinley would support higher
levels of instruction-level parallelism than Merced.

"The key is that we've added a few more instruction units," he explained.
"Now, the interesting thing is, that doesn't necessarily mean that we'll execute
wider instruction bundles [i.e., longer words]. Because it's also possible to
make better use of the existing instruction templates - you can feed more
functional units off of them."

McKinley opens up a can of worms, because VLIW-like (and presumably
EPIC) architectures have had difficulty maintaining software compatibility as
they add more functional units. Mulder said that won't be a problem, because
IA-64 is defined in such a way that "you're always binary-compatible."
Presumably, this will require a heavy dose of dynamic scheduling.

In that regard, Mulder also gave a nod to the importance of software in
making IA-64 a success. "It's very clear that the basic philosophy is that
compilers provide better performance by finding parallelism in the software."

Returning to his main theme, which seemed to be that IA-64 takes the best
from all architectural styles to deliver the best to all possible users, Mulder
made some closing remarks. "The key aspect of IA-64 is that it will help you
with scalar code," he said. "The reason it will do so well on scientific code is
because we threw all those functional units in. But [our use] of predication and
speculation will help greatly with scalar code."

(Predication removes unnecessary branches from an application program,
while speculation masks memory latency by executing load instructions as
soon as possible.)

Asked if there were other mechanisms besides predication and speculation
that would help with instruction prefetching in IA-64, Mulder had a succinct
answer. "Yes," he said.

Copyright © 1999 CMP Media Inc.



To: Paul Engel who wrote (73648) 2/14/1999 4:32:00 PM
From: Paul Engel
 
Intel Investors - Two More Patents issued to Intel and HP (IDEA) concerning Merced/EPIC implementation.

This is quite technical, but indicative of the advances Intel and HP are making for this new architecture.

Paul

{=====================}

techweb.com

February 15, 1999, Issue: 1048
Section: Systems & Software

Techniques of predication and speculation detailed --
Patents shed light on Merced
Alexander Wolfe

Santa Clara, Calif. - Two newly issued patents appear to provide the deepest insight yet available into the inner workings of Intel Corp.'s upcoming Merced microprocessor.

Word of the patents comes at a timely moment, as Intel gears up for its Developer's Conference, to be held in Palm Springs, Calif., Feb. 23-25. Intel officials said they will brief developers at the conference on some heretofore undisclosed features of Merced and its companion IA-64 architecture. Those will include at least a peek at some detail of the processor's 64-bit instruction set as well as some additional data on its registers.

The briefing will further Intel's strategy of stoking interest in IA-64 by releasing incremental bits of information every few months. Most recently, at the Microprocessor Forum conference in San Jose, Calif., last November, Intel displayed the first high-level block diagram of Merced to be seen in public.

Twin foundations

At next week's event, Intel officials are expected to delve further into predication and speculation, which are the twin foundations of IA-64. Predication removes branches from code by essentially executing both pre- and post-branch instructions at the same time. Then, the results from instructions that wouldn't have been executed during a real-world sequential run through the code are thrown out.

Speculation masks memory latency by essentially yanking load instructions out of their normal place in the middle of a branch and bringing them forward to be initiated as early as possible in the program flow.
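
The two ideas described above can be illustrated with a toy model. This is only a sketch in Python, not IA-64 code: the function names are hypothetical, and the deferred-fault flag used for the early load is an assumption made for illustration, since the article does not say how a fault from a hoisted load is handled.

```python
# Toy illustration of predication and speculation. All names are hypothetical;
# real hardware applies these ideas to machine instructions, not Python code.


def branchy_abs_diff(a: int, b: int) -> int:
    """Conventional form: a conditional branch selects one path."""
    if a > b:
        return a - b
    return b - a


def predicated_abs_diff(a: int, b: int) -> int:
    """If-converted form: both arms execute, a predicate picks the survivor."""
    p = int(a > b)               # 1-bit predicate
    taken_result = a - b         # executed regardless of p
    not_taken_result = b - a     # executed regardless of p
    # The result whose predicate is false is discarded.
    return p * taken_result + (1 - p) * not_taken_result


def speculative_load(memory: list, addr: int):
    """Early load that records a bad address instead of faulting (assumption)."""
    if 0 <= addr < len(memory):
        return memory[addr], True
    return 0, False


def hoisted(memory: list, addr: int, cond: bool) -> int:
    """The load is issued before the branch; the value is used only if cond holds."""
    value, ok = speculative_load(memory, addr)   # started as early as possible
    if cond:
        if not ok:
            raise IndexError(addr)   # fault surfaces only when the value is needed
        return value + 1
    return 0
```

For example, `predicated_abs_diff(3, 5)` gives the same answer as `branchy_abs_diff(3, 5)` without taking a branch, and `hoisted([10, 20], 5, False)` returns 0 even though address 5 is out of range, because the loaded value is never consumed.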

The two new patents, obtained by EE Times, appear to provide more insight into operational details behind speculation and predication than Intel is expected to disclose at its developer's conference.

"They are definitely both Merced patents," said Rich Belgard, a microprocessor-patent expert based in Saratoga, Calif. "They discuss implementation details. They're interesting patents but not fundamental."

The first patent, No. 5,860,017, was issued to Intel on Jan. 12 and is titled "Processor and method for speculatively executing instructions from multiple instruction streams indicated by a branch instruction." (One of the inventors, Hans Mulder, recently spoke about IA-64; see sidebar at right.)

Interestingly, the second patent was issued to the Institute for the Development of Emerging Architectures (Idea Corp.; Cupertino, Calif.). Idea is widely acknowledged to be a joint company set up by Intel and Hewlett-Packard as an intellectual-property repository in a bid to make some of their IA-64 patents less accessible to competitive eyes. (HP jointly developed the IA-64 instruction set with Intel.)

That second patent, No. 5,859,999, was also issued on Jan. 12 and is titled "System for restoring predicate registers via a mask having at least a single bit corresponding to a plurality of registers."
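
The patent title suggests a mask in which some bits each cover one predicate register while at least one bit covers a whole group. A minimal sketch of that idea follows. The split used here (mask bits 0-15 for individual registers, bit 16 for registers 16-63 as a group) is a hypothetical layout chosen for illustration, not something the article or the patent title specifies.

```python
# Sketch of restoring predicate registers through a compact mask.
# The layout is HYPOTHETICAL: mask bits 0-15 each select one predicate
# register, and mask bit 16 selects registers 16-63 as a single group.
NUM_PREDICATES = 64
INDIVIDUAL_BITS = 16
ALL_PREDICATES = (1 << NUM_PREDICATES) - 1
INDIVIDUAL_SELECT = (1 << INDIVIDUAL_BITS) - 1
GROUP_SELECT = ALL_PREDICATES & ~INDIVIDUAL_SELECT


def expand_mask(mask: int) -> int:
    """Expand the compact mask into a 64-bit per-register select vector."""
    select = mask & INDIVIDUAL_SELECT
    if mask & (1 << INDIVIDUAL_BITS):
        select |= GROUP_SELECT          # one mask bit covers many registers
    return select


def restore_predicates(current: int, saved: int, mask: int) -> int:
    """Copy saved predicate bits over the current ones where the mask selects."""
    select = expand_mask(mask)
    return (current & ~select) | (saved & select)
```

With this layout, `restore_predicates(0, ALL_PREDICATES, 1 << 16)` restores registers 16 through 63 in one step while leaving registers 0 through 15 untouched.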

Inner operations

The first patent probably sheds more light on how Merced actually implements
predication while processing multiple instruction streams.

Due to ship in 2000, Merced is the first processor to implement Intel's IA-64
architecture and its Explicitly Parallel Instruction Computing (EPIC)
technology. Merced itself has 128 registers, each 64 bits wide. There are also
64 1-bit predicate registers.

EPIC has been billed as a new approach that applies some concepts from
very long-instruction-word computing but that is altogether different from
VLIW. Essentially, EPIC is intended to enable Merced to handle a large
number of instructions and feed them to multiple on-chip functional units for
execution on every clock cycle.

Patent 5,860,017 details how multiple instruction streams can be speculatively
executed, which is the tough task faced by Merced. Speculative execution on
its own is difficult; speculatively executing more than one instruction stream is
much, much tougher.

The concept of speculative execution isn't new, but the patent puts some
twists on the technique. "What's old in the [patent] art is going down both
paths of a branch and speculatively fetching and executing both paths of a
conditional branch," said patent expert Belgard. "What's new is that they only
do that when their branch-prediction logic is unlikely to have predicted the
branch correctly."

Incorrect guesses will occur more often when multiple instruction streams are
in play - that is, when the CPU is proceeding down two paths at once.

"Typically, when you get a conditional branch, you rely on the branch
predictor to figure out which path to take - either the 'taken' path or the
'not-taken' path," Belgard said. "In this case, they have an extra indicator that
says, 'Our branch predictor is unlikely to be correct because of the history.'
And if it is unlikely to be correct, then they speculatively execute both paths of
the branch until they determine [the condition by which] they can throw one
away."
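
Belgard's description can be sketched as a predictor that carries a confidence indicator alongside its usual prediction. The code below is a minimal illustration only: the 2-bit saturating counter, the misprediction-history threshold, and all names are assumptions, since the article does not describe how the patent's indicator is actually computed.

```python
# Sketch of selective dual-path execution: follow both paths of a branch only
# when the predictor's history says its guess is unlikely to be right.
# The counter scheme and threshold are ASSUMPTIONS, not taken from the patent.
from dataclasses import dataclass


@dataclass
class BranchEntry:
    counter: int = 1       # 2-bit saturating counter (0..3); >= 2 predicts taken
    mispredicts: int = 0   # saturating count of recent mispredictions (0..3)

    def predict_taken(self) -> bool:
        return self.counter >= 2

    def low_confidence(self, threshold: int = 2) -> bool:
        return self.mispredicts >= threshold

    def update(self, taken: bool) -> None:
        """Record the real outcome once the branch condition is resolved."""
        if self.predict_taken() == taken:
            self.mispredicts = max(0, self.mispredicts - 1)
        else:
            self.mispredicts = min(3, self.mispredicts + 1)
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def paths_to_fetch(entry: BranchEntry) -> list:
    """Return which instruction streams to execute speculatively."""
    if entry.low_confidence():
        return ["taken", "not-taken"]   # both streams; one is discarded later
    return ["taken"] if entry.predict_taken() else ["not-taken"]
```

A branch that alternates between taken and not-taken quickly drives `mispredicts` up, so `paths_to_fetch` switches to returning both streams; a well-behaved branch keeps following a single predicted path.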

Since one important purpose behind patents is to provide companies with a
legal foundation from which they can protect their products, many applications
make substantial claims for the device at issue. (Claims are the meat of a
patent, detailing the invention the company states it has made.)

That's the case in patent 5,860,017. In summary form, Intel is essentially
setting forth as its first claim that it has come up with "a microprocessor for
efficient processing of instructions in a program flow including conditional
instructions, such as a branch." However, the dependent claims focus on more
specific innovations such as details of stream-management logic and
branch-prediction logic.

The second newly issued Merced-related patent concerns predicate registers.
The processor essentially uses those registers to keep track of what's
happening where during speculative execution.

Tips details

The patent is decidedly arcane, but it does offer up some new evidence of
Merced's internals. "It's interesting in that it shows that the Merced instructions
appear to be 41 bits wide," noted Belgard.

According to a diagram included with the patent, bits 0 through 5 for the
instruction word contain a field called "controlling predicate." Bits 6 through
12 hold a 7-bit mask. Bits 13 through 19 are devoted to "restore from general
register."

The functions of bits 20 through 23, and of bit 32, are not indicated.

Bits 24 through 31 hold an 8-bit mask. Bits 33 through 36 contain the
sub-opcode. Bit 38 appears to be the predicate-register mask bit. Bits 37
through 40 are allocated to "opcode restore predicate registers."
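
The bit ranges listed above can be turned into a small field extractor. This is a sketch based only on the ranges as reported in the article. The field names are informal labels chosen here, the unspecified bits (20-23 and 32) are skipped, and the reported overlap between bit 38 and the opcode field in bits 37-40 is not modeled separately.

```python
# Field extraction for the 41-bit instruction layout described above.
# Bit ranges follow the article; field names are informal labels (assumption).
INSTRUCTION_BITS = 41

FIELDS = {
    "controlling_predicate": (0, 5),
    "mask7": (6, 12),
    "restore_from_gr": (13, 19),
    "mask8": (24, 31),
    "sub_opcode": (33, 36),
    "opcode": (37, 40),
}


def extract(word: int, low: int, high: int) -> int:
    """Return bits low..high (inclusive) of word as an unsigned integer."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def decode(word: int) -> dict:
    """Split a 41-bit instruction word into the fields named in FIELDS."""
    if not 0 <= word < (1 << INSTRUCTION_BITS):
        raise ValueError("expected a 41-bit instruction word")
    return {name: extract(word, low, high) for name, (low, high) in FIELDS.items()}


# Build an arbitrary example word field by field, then decode it again.
example = (0b1010 << 37) | (0b0011 << 33) | (0xAB << 24) | (5 << 13) | (0x7F << 6) | 0x2A
fields = decode(example)
```

Decoding `example` recovers the values that were packed in, for instance `fields["opcode"] == 0b1010` and `fields["mask8"] == 0xAB`.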

Copyright © 1999 CMP Media Inc.