Technology Stocks : Intel Corporation (INTC)


To: Elmer who wrote (145414)10/16/2001 12:38:53 PM
From: Jim McMannis  Respond to of 186894
 
RE:"Why do you consider that a dove attitude? What would a hawk do?"

Exactly what the FBI is doing. Find the planes, the buyer. Hold the planes until more is known.

OTOH, foreigners come here to buy aircraft all the time...
Could be nothing, could be something.

Jim



To: Elmer who wrote (145414)10/16/2001 1:40:52 PM
From: Paul Engel  Read Replies (2) | Respond to of 186894
 
Elmer - A glimpse into Intel's Hyperthreading implementation and potential benefits:

So rather than use that "basic trigger" for the new threads, Intel researchers developed a way for threads to beget more threads. Intel claims the speedup gained from creating such thread chains averages 76 percent but can range up to 169 percent.

"We get a significant speedup using these chain triggers," Shen said. "If you have two logical processors, you can get 30 percent speedup. If you can do more than two logical processors — say, eight threads — you can now see a very significant speedup by using speculative precomputation threads."


I assume the patent applications for these developments are numerous!

Paul
{============================}

Intel looks to bridge gap in multithreading CPU landscape

By Anthony Cataldo, EE Times
Oct 16, 2001 (10:29 AM)
URL: eetimes.com

SAN JOSE, Calif. — With instruction-level parallelism out of fashion and thread-level parallelism the buzzword du jour, Intel Corp. is proposing "pseudo-parallelism" as the next step in microprocessor design. The approach lets a CPU force single-threaded applications to act as if they have multiple threads.

The new wrinkle in parallelism, which Intel discussed here at the Microprocessor Forum, builds on a multithreading scheme called hyperthreading that Intel disclosed earlier this year. Hyperthreading, a latent feature in the P4 architecture that will be activated first in a Xeon processor next year, allows one CPU to act as two logical processors when it encounters applications that are split into separate threads.

But with the exception of such server applications as database management software, most applications can't take advantage of hyperthreading, because they are still single-threaded. That presents a problem for Intel, which plans eventually to deploy hyperthreading in desktop systems. With pseudo-parallelism — more formally known as speculative precomputation — Intel exploits this "second," latent processor in single-threaded applications that would otherwise have remained idle.

Intel is looking to apply the new form of parallelism to one of the most vexing trouble spots in microprocessors: memory access. Cache misses are getting harder for CPU designers to swallow as the penalty for accessing external DRAM worsens. An early 66-MHz Pentium lost only 70 cycles to a DRAM access, but when processors reach 5 to 10 GHz, the cost of retrieving data from DRAM will be measured in thousands of cycles.
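The article's cycle counts follow from simple arithmetic, assuming DRAM latency stays roughly constant in wall-clock time while CPU clocks climb — a back-of-the-envelope sketch, not an Intel figure:

```python
# Back-of-the-envelope check of the article's numbers: 70 cycles on a
# 66 MHz Pentium is 70 / 66e6 s ~= 1.06 microseconds of wall-clock DRAM
# latency. Holding that latency fixed (DRAM improves far more slowly than
# CPU clocks), the same access costs clock_hz * latency_s cycles on a
# faster core.

def dram_miss_cycles(clock_hz, latency_s=70 / 66e6):
    """Cycles lost to one DRAM access at a given clock rate, assuming the
    wall-clock latency implied by the 66 MHz / 70-cycle data point."""
    return clock_hz * latency_s

print(round(dram_miss_cycles(66e6)))   # 70, the article's Pentium figure
print(round(dram_miss_cycles(5e9)))    # ~5300 cycles at 5 GHz
print(round(dram_miss_cycles(10e9)))   # ~10600 cycles at 10 GHz
```

The "thousands of cycles" claim drops straight out once the clock passes a few gigahertz.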

"It reminds me of what disk latencies used to be 10 or 20 years ago," said Glenn Hinton, Intel fellow and director of IA-32 architecture development.

Pseudo-parallelism attacks memory latency by minimizing cache misses, thus reducing the frequency of accesses to main memory. Intel has identified 10 static loads that do not lend themselves to prefetching and that are susceptible to stalls, either because they have too many dependencies or because they don't otherwise exhibit predictable access patterns. It is those "delinquent loads" that cause 80 to 90 percent of the cache misses.

"We're looking at lots of pointer chasing that induces L2 and L3 cache misses," said John Shen, director of Intel's Advanced Architecture Labs.
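Why pointer chasing defeats a hardware prefetcher: the address of each load is only known once the previous load's data arrives, so there is no stride to run ahead of. A minimal illustration of the access pattern (the data layout and names are invented for the sketch):

```python
import random

# A linked list whose nodes are scattered: node i's 'next' field must be
# loaded before node i+1's address is known, so a stride-based prefetcher
# has nothing to predict. An array walk, by contrast, has a fixed stride.

random.seed(0)
N = 1000
order = list(range(N))
random.shuffle(order)                      # scatter the nodes

# nodes[i] = (payload, index_of_next_node); one cycle through all N nodes
nodes = [None] * N
for pos, nxt in zip(order, order[1:] + [order[0]]):
    nodes[pos] = (pos * 2, nxt)

def chase(start, steps):
    """The 'delinquent load' pattern: each iteration's address comes from
    the value loaded in the previous iteration."""
    total, cur = 0, start
    for _ in range(steps):
        payload, cur = nodes[cur]          # this load depends on the last one
        total += payload
    return total

print(chase(order[0], N))                  # visits every node exactly once
```

Each `nodes[cur]` is the kind of load the article calls delinquent: serialized on its predecessor and aimed at an unpredictable address.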

Speculative precomputation works by spawning a new thread in an otherwise single-threaded application when an instruction reaches a certain stage in a pipeline. That is done by attaching code to the tail end of an existing binary; recompiling the code is unnecessary. When a thread is triggered, the second, idle logical processor comes to life and performs the cache prefetching.

"The objective is that the speculative-precomputation thread will trigger cache accesses much earlier than the main thread that encounters the delinquent load. We're trying to mask or eliminate all the cache miss latencies," Shen said.
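The mechanism Shen describes can be caricatured in a deterministic toy model: a helper slice runs only the address-generating part of the loop a few iterations ahead and warms a simulated cache, so the main thread's delinquent loads hit. The cycle costs and depth are invented; the real helper runs on the second logical processor, not interleaved like this:

```python
# Toy model of speculative precomputation: a "helper" slice races a few
# loads ahead of the main thread, inserting addresses into a simulated
# cache so the main thread's delinquent loads hit instead of miss.
# MISS_COST, HIT_COST, and prefetch_depth are made up for illustration.

MISS_COST, HIT_COST = 100, 1

def run(addrs, cache, prefetch_depth=0):
    """Walk the address stream; if prefetch_depth > 0, the helper slice
    prefetches that many loads ahead of the main thread."""
    cycles = 0
    for i, a in enumerate(addrs):
        # helper slice: prefetch a future address into the cache
        if prefetch_depth and i + prefetch_depth < len(addrs):
            cache.add(addrs[i + prefetch_depth])
        # main thread: the delinquent load itself
        cycles += HIT_COST if a in cache else MISS_COST
        cache.add(a)
    return cycles

stream = list(range(0, 4000, 4))           # 1000 distinct addresses
cold = run(stream, set())                  # every load misses: 100000 cycles
warm = run(stream, set(), prefetch_depth=8)
print(cold, warm)                          # prints: 100000 1792
```

Only the first 8 loads miss in the warm run; everything after has already been prefetched, which is exactly the masking of miss latency Shen describes.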

Early experiments with speculative precomputation backfired: Instead of helping CPU performance, the technique slowed it down because the pipeline had to be flushed out every time a new thread was spawned, eating more CPU cycles, said Shen. So rather than use that "basic trigger" for the new threads, Intel researchers developed a way for threads to beget more threads. Intel claims the speedup gained from creating such thread chains averages 76 percent but can range up to 169 percent.

"We get a significant speedup using these chain triggers," Shen said. "If you have two logical processors, you can get 30 percent speedup. If you can do more than two logical processors — say, eight threads — you can now see a very significant speedup by using speculative precomputation threads."
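The contrast between the "basic trigger" that backfired and the chain triggers can be put in a toy cost model. All numbers here are invented for illustration — the point is only the structure: with a basic trigger the main thread pays a pipeline-flush cost per helper spawned, while with chain triggers it pays that cost once and each helper spawns its successor off the critical path:

```python
# Toy cost model for the two trigger schemes. FLUSH is the pipeline-flush
# cost the MAIN thread pays per spawn it issues; SAVED is the miss latency
# each helper thread hides. Both values are invented for illustration.

FLUSH = 50
SAVED = 30

def basic_trigger(n_helpers):
    # main thread spawns (and pays a flush for) every helper
    return n_helpers * FLUSH - n_helpers * SAVED   # net cycles; + is slower

def chain_trigger(n_helpers):
    # main thread spawns only the first helper; helpers beget the rest
    return FLUSH - n_helpers * SAVED               # net cycles; + is slower

for n in (1, 4, 16):
    print(n, basic_trigger(n), chain_trigger(n))
```

With these numbers the basic trigger is a net loss at every helper count — mirroring the early experiments that slowed the CPU down — while the chain trigger amortizes its single flush and wins as soon as more than one helper runs.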


By taking advantage of the extra thread produced in hyperthreading, the technique remains consistent with Intel's overall goal of staying within strict power and die-size budgets. In the strictest sense, a hyperthreaded CPU is not a full-fledged multithreaded machine, because it splits most processor resources between two threads instead of duplicating the hardware. But it derives better performance by behaving like two logical processors, at a cost of only 5 percent more die area and power consumption.
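The trade-off in that paragraph is easy to quantify from the article's own figures — 30 percent speedup for 5 percent more area, versus a full second core that roughly doubles area and helps only when a second thread exists:

```python
# Speedup per unit of added die area, using the article's figures:
# hyperthreading buys ~30% speedup for ~5% extra area; an ideal second
# core buys at most 100% speedup for ~100% extra area, and 0% when only
# one thread is available.

def speedup_per_added_area(speedup_pct, extra_area_pct):
    return speedup_pct / extra_area_pct

ht = speedup_per_added_area(30, 5)        # hyperthreading: 6.0
dual = speedup_per_added_area(100, 100)   # ideal dual-core, two threads: 1.0
print(ht, dual)
```

By this crude measure hyperthreading is several times more area-efficient than duplication, which is the argument Hinton makes in the closing quotes.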

One alternative to speculative precomputation would be to try to get even more instruction-level parallelism using superscalar techniques. But that approach is reaching its practical limits from a power and die size point of view.

"There are a lot of execution units not highly utilized, because of instruction dependencies," Hinton said. "With two processors, the peak execution is six instructions per clock for each processor, but each processor is not necessarily using all the execution resources.

"It still takes twice the power and twice the die size when only one thread is available. Half the hardware is completely idle."

Copyright 1998 CMP Media Inc