Politics : Formerly About Advanced Micro Devices


To: Cirruslvr who wrote (74100)10/5/1999 7:46:00 PM
From: dumbmoney
 
The AMD Athlon™ Processor:
Future Directions
Fred Weber
Vice President, Engineering
Computation Products Group

Agenda
 AMD Athlon™ Processor
 Current Status
 Workstation and Server Features

 Key Technologies For the Future
 I/O and MP Scalability and Flexibility
 64 bit Computing and Processor Performance

AMD Athlon™ Processor Today
 Microprocessor Forum 1998: AMD Discloses AMD Athlon processors for shipment in 1H99 at >500MHz
 AMD ships the AMD Athlon processor in June @ 500, 550, & 600 MHz
 AMD Athlon 650 MHz processor introduced and shipped in August
 Reviews confirm AMD Athlon processor has the fastest x86 integer, floating point and multimedia performance
 A 700 MHz AMD Athlon processor was announced on October 4, '99
 0.18 micron AMD Athlon processor is in production and sampling now

Workstation & Server Capabilities
 200 MHz FSB (1.6GB/sec) that can scale to 400 MHz (3.2GB/sec) per processor
 Backside L2 Cache interface supports up to 8MB
 Up to 24 Outstanding Transactions per Processor
 13-Pin Address Bus
 13-Pin Snoop Bus
 72-Pin Data Bus w/ ECC
 Scales to 43-bit Physical Address
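The bandwidth and address-space figures above can be checked with a little arithmetic. The sketch below assumes that the 72-pin data bus carries 64 data bits plus 8 ECC bits (the slide does not spell out the split) and that the quoted MHz figures are transfers per second. The function names are illustrative only.

```python
# Assumption (not stated on the slide): 72 data pins = 64 data bits + 8 ECC bits.
DATA_BITS = 64


def bus_bandwidth_gb_per_s(transfers_mhz, data_bits=DATA_BITS):
    """Peak bandwidth in decimal GB/s for a bus moving data_bits per transfer."""
    bytes_per_transfer = data_bits // 8
    return transfers_mhz * 1e6 * bytes_per_transfer / 1e9


def addressable_bytes(address_bits):
    """Number of bytes reachable with the given physical address width."""
    return 1 << address_bits


if __name__ == "__main__":
    for mhz in (200, 400):
        print(f"{mhz} MHz FSB: {bus_bandwidth_gb_per_s(mhz):.1f} GB/sec")
    print(f"43-bit physical address: {addressable_bytes(43) // 2**40} TiB")
```

Under these assumptions the 200 MHz and 400 MHz cases reproduce the 1.6GB/sec and 3.2GB/sec figures on the slide, and a 43-bit physical address covers 8 TiB.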

 Processor Scalability
 1MB and 2MB Full Speed, 16-way associative L2 Cache
 266MHz Front Side Bus
 System Scalability: 2-way; 4 to 8 way multiprocessing
 AMD: 2 processor design
 266MHz Front Side Bus with DDR DRAM (PC-2100™ )
 4X AGP-Pro, PCI 66/64
 Multi-way: Infrastructure development underway (API and HotRail)
 Reliability
 ECC Protected L2 Cache, DRAM and Front Side Bus
 Execution Signature Generation

AMD's System Bus Initiative:
Lightning Data Transport (LDT)
 Goals
 Simplify design and flexibility with a single data link for “in-chassis” connection to I/O, multi-processing and co-processors
 Improve system performance with increased I/O performance and scalable bandwidth
 Enable flexibility of system I/O technologies through a modular bridge architecture
 Complement externally visible bus standards

AMD's LDT: I/O
 I/O can be daisy chained
 Multiple bridges on a single I/O link
 Multiple “pass through” devices can be interconnected
 Bridges are independent (reusable for many designs)
 The System I/O SANIC (HCA) is independent of the memory controller

AMD's LDT: Multiprocessing
LDT Features
 Unidirectional point-to-point links in each direction
 Differential signaling with source synchronous clock forwarding
 Variable widths negotiated at initialization
 Upstream and downstream links can be of different size
 16/16-bit link provides 6.4 GB/sec each way
 Multiple logical channels in each link
 Guaranteed isochronous bandwidth
 In-band system management and legacy signal transport
 PCI like configuration mechanism
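Since link widths are negotiated and the two directions may differ, it can help to see how bandwidth would scale with width. The sketch below takes the slide's 16-bit, 6.4 GB/sec figure at face value, derives the per-pin transfer rate it implies, and assumes bandwidth scales linearly with width. The 8-bit widths in the loop are hypothetical examples, not values from the presentation.

```python
def implied_transfer_rate(bandwidth_bytes_per_s, width_bits):
    """Transfers per second per pin implied by a quoted one-way bandwidth."""
    return bandwidth_bytes_per_s * 8 / width_bits


def link_bandwidth(width_bits, transfers_per_s):
    """One-way bandwidth in bytes per second, assuming linear scaling with width."""
    return width_bits * transfers_per_s / 8


# Rate implied by the slide's "16/16-bit link provides 6.4 GB/sec each way".
RATE = implied_transfer_rate(6.4e9, 16)

if __name__ == "__main__":
    # (upstream width, downstream width); only 16/16 comes from the slide.
    for up, down in ((16, 16), (16, 8), (8, 8)):
        up_gb = link_bandwidth(up, RATE) / 1e9
        down_gb = link_bandwidth(down, RATE) / 1e9
        print(f"{up}/{down}-bit link: {up_gb:.1f} GB/sec up, {down_gb:.1f} GB/sec down")
```

This is only a scaling illustration; the actual widths and clock rates LDT supports are not given in the slides.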

64-bit Computing: A Compatible Approach
x86-64

64-Bit Computing
 The purpose of 64-bit computing: Enable large memory OS and applications
 AMD has defined and will deliver x86-64
 Extend x86 architecture to 64 bits
 5% die area cost, minimal design complexity cost
 Leverage all existing development tools and knowledge
 Key Benefits
 Core performance will be state of the art for both 32-bit and 64-bit applications (they are the same core)
 Allows a migration from 32-bit to 64-bit to be seamless and at the user's pace

The Last Diversion:
RISC vs CISC
 RISC (and other) ideas brought to x86
 Emphasis on frequency and time to market
 Non-microcoded execution
 Pipelined execution
 Superscalar execution with renaming
 Out of order load/store, prefetch
 The gap has been closed for integer performance
 The gap will be closed for floating point performance within a generation

RISC vs x86 Performance
Architectural Performance
 The Heavy Hitters
 Frequency
 Power
 Thread level parallelism
 The Rest of the Story - Pursue Wisely
 Operation (especially memory) latency
 Code stream predictability
 Code path length
 Instruction level parallelism
 Instruction Set is one of the Weaker Tools

The Heavy Hitters
 Frequency is the tide that lifts all boats
 The frequency limitations of x86 have been largely solved
• Variable length instructions
• Implicit flags
 Further pipelining will further increase frequency
 New architectures introduce new frequency limiters
 Power Limits Frequency
 Excessive speculation wastes power
 Large resources waste power
 AMD plans to deploy multiple x86-64 processors on a single die
 Multiprocessing is finally real, especially for server applications
 Uniprocessors must have leading performance
 But, they must also be area and power efficient to enable on-die MP

Instruction Set: Fixes & Tweaks
 A simple change fixes the one broken aspect of the IA32 - The x87 FPU register stack
 AMD solved the problem for multimedia floating point with 3DNow!™ technology
 AMD will introduce technical floating point instructions (TFP) with x86-64
• 3 operand, register to register instruction formats
• Double Precision IEEE operations
 Should enable closing the SPECfp performance gap
 Instruction encoding added within existing instruction set
• Like MMX, 3DNow!™ technology and SSE
 Minor additions to the instruction set
• Prefetch, Specialized Operations

The x86-64 Instruction Set
 All existing x86 modes and segments are compatible
 64-bit code segments include
 x86 and x87 instructions are supported
 No segment base or limit registers
 Default 64-bit addressing
• Override for 32-bit addressing
 Default 8 and 32-bit data operations
• Most data remains 32-bit
• Overrides for 64 and 16-bit operations
• 32-bit operations sign extend to 64, others preserve bits [63:32]
 Immediates and displacements are 8 and 32-bits
• Provide code density
• 64-bit Immediate for MOV immediate to register
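The operand-size rules above can be modelled as a small function that writes a result into a 64-bit register. This is a sketch of the behaviour as worded on the slide, not of any published specification: 32-bit results are sign extended to 64 bits, and 8-bit and 16-bit results are assumed to leave every bit above the operand size untouched (which includes bits [63:32]). The name `write_result` is made up for illustration.

```python
MASK64 = (1 << 64) - 1


def write_result(old, result, size_bits):
    """Return the new 64-bit register value after writing a result of size_bits."""
    if size_bits == 64:
        return result & MASK64
    if size_bits == 32:
        low = result & 0xFFFFFFFF
        # Sign extend bit 31 into bits [63:32], as the slide describes.
        if low & 0x80000000:
            return low | 0xFFFFFFFF00000000
        return low
    if size_bits in (8, 16):
        # Assumption: all bits above the operand size are preserved.
        mask = (1 << size_bits) - 1
        return (old & ~mask & MASK64) | (result & mask)
    raise ValueError("unsupported operand size")


if __name__ == "__main__":
    old = 0x1122334455667788
    for size, value in ((32, 0x80000000), (32, 0x7FFFFFFF), (16, 0xABCD), (8, 0xAB)):
        print(f"{size:2d}-bit write of {value:#x}: {write_result(old, value, size):#018x}")
```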

Performance AND Compatibility
 How do these 64-bit solutions try to maintain compatibility?
 Emulation (DEC/Compaq's)
 x86 as a unique processor
• (SUN's: PCI x86 coprocessor card)

 Two processors based on different instruction sets on a single die

 In each case, any x86 application is relegated to second class status
 Performance will always lag
 Application developers must decide when to migrate from x86

Summary
 AMD Athlon processors are providing performance leadership
 AMD will extend leadership performance
 LDT provides the bandwidth I/O and SIO require
 LDT simplifies and improves MP solutions
 x86-64 provides a compatible future with uncompromised performance
 Compatibility is the key
 Preserve the industry's and your customer's investment