To: Cirruslvr who wrote (74100) 10/5/1999 7:46:00 PM From: dumbmoney
The AMD Athlon™ Processor: Future Directions
Fred Weber
Vice President, Engineering
Computation Products Group

Agenda
- AMD Athlon™ Processor
- Current Status
- Workstation and Server Features
- Key Technologies For the Future
- I/O and MP Scalability and Flexibility
- 64-bit Computing and Processor Performance

AMD Athlon™ Processor Today
- Microprocessor Forum 1998: AMD Discloses AMD Athlon processors for shipment in 1H99 at >500MHz
- AMD ships the AMD Athlon processor in June @ 500, 550, & 600 MHz
- AMD Athlon 650 MHz processor introduced and shipped in August
- Reviews confirm AMD Athlon processor has the fastest x86 integer, floating point and multimedia performance
- A 700 MHz AMD Athlon processor was announced on October 4, '99
- 0.18 micron AMD Athlon processor is in production and sampling now

Workstation & Server Capabilities
- 200 MHz FSB (1.6GB/sec) that can scale to 400 MHz (3.2GB/sec) per processor
- Backside L2 Cache interface supports up to 8MB
- Up to 24 Outstanding Transactions per Processor
- 13-Pin Address Bus
- 13-Pin Snoop Bus
- 72-Pin Data Bus w/ ECC
- Scales to 43-bit Physical Address
- Processor Scalability
- 1MB and 2MB Full Speed, 16-way associative L2 Cache
- 266MHz Front Side Bus
- System Scalability: 2-way; 4 to 8 way multiprocessing
- AMD: 2 processor design
- 266MHz Front Side Bus with DDR DRAM (PC-2100™)
- 4X AGP-Pro, PCI 66/64
- Multi-way: Infrastructure development underway (API and HotRail)
- Reliability
- ECC Protected L2 Cache, DRAM and Front Side Bus
- Execution Signature Generation

AMD's System Bus Initiative: Lightning Data Transport (LDT)
- Goals
- Simplify design and flexibility with a single data link for "in-chassis" connection to I/O, multi-processing and co-processors
- Improve system performance with increased I/O performance and scalable bandwidth
- Enable flexibility of system I/O technologies through a modular bridge architecture
- Complement externally visible bus standards

AMD's LDT: I/O
- I/O can be daisy chained
- Multiple bridges on a single I/O link
- Multiple "pass through" devices can be interconnected
- Bridges are independent (reusable for many designs)
- The System I/O SANIC (HCA) is independent of the memory controller

AMD's LDT: Multiprocessing

LDT Features
- Unidirectional point-to-point links in each direction
- Differential signaling with source synchronous clock forwarding
- Variable widths negotiated at initialization
- Upstream and downstream links can be of different size
- 16/16-bit link provides 6.4 GB/sec each way
- Multiple logical channels in each link
- Guaranteed isochronous bandwidth
- In-band system management and legacy signal transport
- PCI-like configuration mechanism

64-bit Computing: a Compatible Approach
x86-64

64-Bit Computing
- The purpose of 64-bit computing: Enable large memory OS and applications
- AMD has defined and will deliver x86-64
- Extend x86 architecture to 64 bits
- 5% die area cost, minimal design complexity cost
- Leverage all existing development tools and knowledge
- Key Benefits
- Core performance will be state of the art for both 32-bit and 64-bit applications (they are the same core)
- Allows a migration from 32-bit to 64-bit to be seamless and at the user's pace

The Last Diversion: RISC vs CISC
- RISC (and other) ideas brought to x86
- Emphasis on frequency and time to market
- Non-microcoded execution
- Pipelined execution
- Superscalar execution with renaming
- Out of order load/store, prefetch
- The gap has been closed for integer performance
- The gap will be closed for floating point performance within a generation

RISC vs x86 Performance

Architectural Performance
- The Heavy Hitters
- Frequency
- Power
- Thread level parallelism
- The Rest of the Story - Pursue Wisely
- Operation (especially memory) latency
- Code stream predictability
- Code path length
- Instruction level parallelism
- Instruction Set is one of the Weaker Tools

The Heavy Hitters
- Frequency is the tide that lifts all boats
- The frequency limitations of x86 have been largely solved
  • Variable length instructions
  • Implicit flags
- Further pipelining will further increase frequency
- New architectures introduce new frequency limiters
- Power Limits Frequency
- Excessive speculation wastes power
- Large resources waste power
- AMD plans to deploy multiple x86-64 processors on a single die
- Multiprocessing is finally real, especially for server applications
- Uniprocessors must have leading performance
- But, they must also be area and power efficient to enable on-die MP

Instruction Set: Fixes & Tweaks
- A simple change fixes the one broken aspect of the IA32 - The x87 FPU register stack
- AMD solved the problem for multimedia floating point with 3DNow!™ technology
- AMD will introduce technical floating point instructions (TFP) with x86-64
  • 3 operand, register to register instruction formats
  • Double Precision IEEE operations
- Should enable closing the SPECfp performance gap
- Instruction encoding added within existing instruction set
  • Like MMX, 3DNow!™ technology and SSE
- Minor additions to the instruction set
  • Prefetch, Specialized Operations

The x86-64 Instruction Set
- All existing x86 modes and segments are compatible
- 64-bit code segments include
- x86 and x87 instructions are supported
- No segment base or limit registers
- Default 64-bit addressing
  • Override for 32-bit addressing
- Default 8 and 32-bit data operations
  • Most data remains 32-bit
  • Overrides for 64 and 16-bit operations
  • 32-bit operations sign extend to 64, others preserve bits [63:32]
- Immediates and displacements are 8 and 32-bits
  • Provide code density
  • 64-bit Immediate for MOV immediate to register

Performance AND Compatibility
- How do these 64-bit solutions try to maintain compatibility?
- Emulation (DEC/Compaq's)
- x86 as a unique processor
  • (SUN's: PCI x86 coprocessor card)
- Two different instruction set based processors on one single die
- In each case, any x86 application is relegated to second class status
- Performance will always lag
- Application developers must decide when to migrate from x86

Summary
- AMD Athlon processors are providing performance leadership
- AMD will extend leadership performance
- LDT provides the bandwidth I/O and SIO require
- LDT simplifies and improves MP solutions
- x86-64 provides a compatible future with uncompromised performance
- Compatibility is the key
- Preserve the industry's and your customer's investment
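
For anyone who wants to sanity-check the bus numbers on the Workstation & Server Capabilities slide, here is a rough Python sketch of the arithmetic. It is not from the presentation. It assumes the 72-pin data bus is 64 data bits plus 8 ECC bits, that the "MHz" figures count data transfers per second, and that 1 GB means 10^9 bytes. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the slide's bus figures.
# Assumptions (not stated in the slides): 72 data pins = 64 data + 8 ECC,
# "MHz" means millions of transfers per second, 1 GB = 10**9 bytes.
DATA_BITS = 64
ECC_BITS = 8


def fsb_bandwidth_gb_per_s(transfers_mhz: int, data_bits: int = DATA_BITS) -> float:
    """Peak payload bandwidth in GB/s for a given transfer rate."""
    bytes_per_transfer = data_bits // 8
    return transfers_mhz * 1_000_000 * bytes_per_transfer / 1e9


def physical_address_space_tib(address_bits: int) -> int:
    """Size of the physical address space in TiB (2**40 bytes)."""
    return 2 ** address_bits // 2 ** 40


if __name__ == "__main__":
    for rate in (200, 266, 400):
        print(f"{rate} MHz FSB: {fsb_bandwidth_gb_per_s(rate):.1f} GB/s")
    print(f"43-bit physical address: {physical_address_space_tib(43)} TiB")
```

Under those assumptions the 200 MHz and 400 MHz cases come out to 1.6 and 3.2 GB/sec, matching the slide, and a 43-bit physical address covers 8 TiB.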