K7 Spec:
AMD-K7(TM) Processor Architecture
Three Parallel x86 Instruction Decoders
9-issue Superscalar Microarchitecture Optimized for High Frequency
Dynamic Scheduling with Speculative, Out-of-Order Execution
2048-entry Branch Prediction Table & 12-entry Return Stack
3 Superscalar, Out-of-Order Integer Pipelines each Containing:
  Integer Execution Unit
  Address Generation Unit
3 Superscalar, Out-of-Order Multimedia Pipelines with 1-cycle throughput:
  FADD (4 cyc latency), MMX ALU (2 cyc latency), 3DNow!
  FMUL (4 cyc latency), MMX ALU (includes Mul & MAC), 3DNow!
  FSTORE
Level 1 64K I-Cache & 64K D-Cache, each 2-way Set Associative
Multi-level TLB (24/256-Entry I, 32/256-Entry D)
Two General Purpose 64-bit Load/Store Ports into D-Cache
  3-Cycle Load Latency
  Multi-banking Allows Concurrent Access by 2 Load/Stores
High-speed 64-bit Backside L2 Cache Controller
  Supports Sizes of 512KB to 8MB
  Programmable Interface Speeds
High-speed 64-bit System Interface
  First Mainstream Systems to have a 200MHz Bus
  Significant Headroom for Future
Deep Internal Buffering to Support Pipelines and External Interfaces
  Up to 72 x86 instructions in-flight
  32 outstanding load misses
  15-entry integer scheduler
  36-entry floating point scheduler
x86 Instructions are sent to one of two Decoding Pipelines:
  DirectPath: Decodes common x86 instructions (1-15 byte lengths)
  VectorPath: Decodes uncommon, complex x86 instructions
Decoding Pipelines can dispatch 3 MacroOps to Execution Unit Schedulers
Each MacroOp consists of one or two Operations (OPs)
OPs are issued to the execution units
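
As a rough illustration of the decode figures above, the sketch below works out the peak rates they imply. The function and constant names are hypothetical, and it assumes a full group of three MacroOps is dispatched every cycle and that each x86 instruction maps to exactly one MacroOp; neither assumption comes from the spec, and real code will fall short of both.

```python
# Peak dispatch arithmetic from the figures listed above.
# Names are illustrative only; sustained rates on real code will be lower.
MACRO_OPS_PER_CYCLE = 3    # MacroOps dispatched to the schedulers per cycle
MAX_OPS_PER_MACRO_OP = 2   # each MacroOp consists of one or two OPs
MAX_IN_FLIGHT = 72         # x86 instructions in flight


def peak_ops_per_cycle():
    """Upper bound on OPs produced per cycle if every MacroOp carries two OPs."""
    return MACRO_OPS_PER_CYCLE * MAX_OPS_PER_MACRO_OP


def cycles_to_fill_window(instructions_per_cycle=MACRO_OPS_PER_CYCLE):
    """Cycles needed to fill the in-flight window.

    Assumption: one x86 instruction per MacroOp and nothing retiring meanwhile.
    Uses ceiling division so a partial final group still costs a cycle.
    """
    return -(-MAX_IN_FLIGHT // instructions_per_cycle)


if __name__ == "__main__":
    print("peak OPs per cycle:", peak_ops_per_cycle())
    print("cycles to fill the window:", cycles_to_fill_window())
```

Under these assumptions the 72-instruction window corresponds to 24 cycles of full-width dispatch.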
Integer Execution Units
Three Integer Execution Units (IEU)
Three Address Generation Units (AGU)
15-entry Integer Scheduler
Full Out-of-Order Speculative Execution
Multiplier
Superscalar Multimedia Execution Units
Three Superscalar Multimedia Execution Units
3-issue, Out-of-Order, Fully Pipelined Design
Separate Register File
Load-Store Unit and Data Cache
Load Store Unit (LSU)
  44-entry Load/Store queue
  Data forwarding from stores to dependent loads
2-way, 64KB Dual-Ported Data Cache
  MOESI coherency, 64 byte line size
32-entry L1 DTLB and 4-way, 256-entry L2 DTLB
3 sets of data cache tags
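
The data cache figures above (64KB, 2-way, 64 byte lines) fix the number of sets and how an address splits into tag, index, and offset. The sketch below shows that arithmetic. It assumes conventional indexing with the low-order address bits and power-of-two sizes; the spec does not say how the K7 actually indexes its cache, and all names here are hypothetical.

```python
# Cache geometry implied by the listed D-cache parameters.
# Assumption: textbook set-associative indexing on low-order address bits.
CACHE_BYTES = 64 * 1024   # 64KB data cache
WAYS = 2                  # 2-way set associative
LINE_BYTES = 64           # 64 byte line size


def cache_geometry(cache_bytes=CACHE_BYTES, ways=WAYS, line_bytes=LINE_BYTES):
    """Return (sets, offset_bits, index_bits) for power-of-two parameters."""
    sets = cache_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = sets.bit_length() - 1          # log2(sets)
    return sets, offset_bits, index_bits


def split_address(addr):
    """Split an address into (tag, set index, byte offset within the line)."""
    sets, offset_bits, index_bits = cache_geometry()
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    print(cache_geometry())        # sets, offset bits, index bits
    print(split_address(0x12345))  # tag, index, offset
```

With these parameters the cache has 512 sets, so an address uses 6 offset bits and 9 index bits, and the remaining upper bits form the tag.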
System and L2 Cache Interfaces
Alpha EV6 Bus Protocol
Point-to-Point Topology with Clock Forwarding
Decoupled Address and Data Busses
  72-bit Data Bus w/ ECC
  Independent Address/Request Bus
  Independent Snoop Bus
Up to 20 Outstanding Transactions per Processor
Scalable Multiprocessing
L2 Cache Interface
  512KB to 8MB using Industry-Standard SRAMs
  Programmable Interface Speeds
Low-voltage Signaling
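
To put the system interface figures in perspective, the sketch below estimates peak bus bandwidth. It assumes the 72-bit data bus carries 64 data bits plus 8 ECC bits (consistent with the "64-bit System Interface" listed earlier, but not stated outright) and that the "200MHz Bus" figure means 200 million data transfers per second. The function name is hypothetical.

```python
# Peak system-bus bandwidth estimate.
# Assumptions: 64 of the 72 bus bits carry data (the rest are ECC), and
# "200MHz" means 200 million transfers per second.
BUS_BITS_TOTAL = 72
ECC_BITS = 8                           # assumed ECC width
DATA_BITS = BUS_BITS_TOTAL - ECC_BITS  # 64 data bits
TRANSFERS_PER_SECOND = 200_000_000


def peak_bandwidth_bytes_per_s(transfers_per_s=TRANSFERS_PER_SECOND,
                               data_bits=DATA_BITS):
    """Theoretical peak payload bandwidth in bytes per second."""
    return transfers_per_s * data_bits // 8


if __name__ == "__main__":
    gb_per_s = peak_bandwidth_bytes_per_s() / 1e9
    print(f"peak bandwidth: {gb_per_s:.1f} GB/s")
```

This is a ceiling only: protocol overhead and the limit of 20 outstanding transactions will keep sustained throughput below it.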
AMD-K7(TM) Processor Summary
Superior 7th Generation Processor Architecture
  Advanced Processor Core Design
  Leading Edge Frequencies: 500MHz+ using 0.25µm Technology
High Performance System Interface with low-voltage swing
  Point-to-Point Topology and Clock Forwarding Technology
  Scalable Multiprocessing Architecture
AMD-K7 Processor Module, Chipsets, Motherboards
Leading Edge Silicon Technology
  Fab 25 and Fab 30 Provide Volume Production Capacity