From: BeenRetired, 10/11/2025 7:46:46 AM
 
IBM Introduces the [5nm, 128GB RAM] Spyre Accelerator for Commercial Availability

Shannon Davis
4 days ago


IBM today announced the upcoming general availability of the IBM Spyre Accelerator, an AI accelerator enabling low-latency inferencing to support generative and agentic AI use cases while prioritizing the security and resilience of core workloads. Earlier this year, IBM announced the Spyre Accelerator would be available in IBM z17, LinuxONE 5, and Power11 systems. Spyre will be generally available on October 28 for IBM z17 and LinuxONE 5 systems, and in early December for Power11 servers.

Today’s IT landscape is changing from traditional logic workflows to agentic AI inferencing. AI agents require low-latency inference and real-time system responsiveness. IBM recognized the need for mainframes and servers to run AI models along with the most demanding enterprise workloads without compromising on throughput. To address this demand, clients need AI inferencing hardware that supports generative and agentic AI while maintaining the security and resilience of core data, transactions, and applications. The accelerator is also built to enable clients to keep mission-critical data on-prem to mitigate risk while addressing operational and energy efficiency.

The IBM Spyre Accelerator reflects the strength of IBM’s research-to-product pipeline, combining breakthrough innovation from the IBM Research AI Hardware Center with enterprise-grade development from IBM Infrastructure. Initially introduced as a prototype chip, Spyre was refined through rapid iteration, including cluster deployments at IBM’s Yorktown Heights campus, and with collaborators like the University at Albany’s Center for Emerging Artificial Intelligence Systems.

The IBM Research prototype has evolved into an enterprise-grade product for use in IBM Z, LinuxONE and Power systems. Today, the Spyre Accelerator is a commercial system-on-a-chip with 32 individual accelerator cores and 25.6 billion transistors. Produced using 5nm node technology, each Spyre is mounted on a 75-watt PCIe card, which makes it possible to cluster up to 48 cards in an IBM Z or LinuxONE system or 16 cards in an IBM Power system to scale AI capabilities.
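The per-card and per-system numbers above imply some straightforward cluster-level arithmetic. A minimal sketch, assuming the ~300 TOPS (8-bit) per-card estimate quoted later in this post alongside the announced 75-watt envelope and maximum card counts:

```python
# Hedged sketch: aggregate capability of a fully populated Spyre cluster.
# The ~300 TOPS per-card figure is a reported estimate, not an official spec.

SPYRE_TOPS_PER_CARD = 300    # ~8-bit TOPS per card (reported estimate)
SPYRE_WATTS_PER_CARD = 75    # PCIe card power envelope (announced)

def cluster_capability(cards: int) -> tuple[int, int]:
    """Return (aggregate TOPS, aggregate accelerator watts) for a cluster."""
    return cards * SPYRE_TOPS_PER_CARD, cards * SPYRE_WATTS_PER_CARD

# Maximum configurations per the announcement:
z_tops, z_watts = cluster_capability(48)   # IBM Z / LinuxONE: up to 48 cards
p_tops, p_watts = cluster_capability(16)   # IBM Power: up to 16 cards

print(f"IBM Z/LinuxONE: {z_tops:,} TOPS in {z_watts:,} W of accelerator power")
print(f"IBM Power:      {p_tops:,} TOPS in {p_watts:,} W of accelerator power")
```

Note that these totals cover only the accelerator cards themselves, not host CPU, memory, or cooling power.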

“One of our key priorities has been advancing infrastructure to meet the demands of new and emerging AI workloads,” said Barry Baker, COO, IBM Infrastructure & GM, IBM Systems. “With the Spyre Accelerator, we’re extending the capabilities of our systems to support multi-model AI – including generative and agentic AI. This innovation positions clients to scale their AI-enabled mission-critical workloads with uncompromising security, resilience, and efficiency, while unlocking the value of their enterprise data.”

“We launched the IBM Research AI Hardware Center in 2019 with a mission to meet the rising computational demands of AI, even before the surge in LLMs and AI models we’ve recently seen,” said Mukesh Khare, GM of IBM Semiconductors and VP of Hybrid Cloud, IBM. “Now, amid increasing demand for advanced AI capabilities, we’re proud to see the first chip from the Center enter commercialization, designed to deliver improved performance and productivity to IBM’s mainframe and server clients.”

For IBM clients, Spyre Accelerators offer fast, secured processing with on-prem AI acceleration. This marks a significant milestone, allowing businesses to leverage AI at scale while keeping data on IBM Z, LinuxONE and Power systems. In mainframe systems, coupled with the Telum II processor for IBM Z and LinuxONE, Spyre offers enhanced security, low latency, and high transaction rate processing power. Leveraging this advanced hardware and software stack, businesses can use Spyre to scale multiple AI models to power predictive use cases such as advanced fraud detection and retail automation.

On IBM Power-based servers, Spyre customers can leverage a catalog of AI services, enabling end-to-end AI for enterprise workflows. Clients can install the AI services from the catalog with just one click.[1] Spyre Accelerator for Power, combined with an on-chip accelerator (MMA), also accelerates data conversion for generative AI to deliver high throughput for deep process integrations. Additionally, with a prompt size of 128, it enables the ingestion of more than 8 million documents for knowledge base integration in an hour.[2] This performance, combined with the IBM software stack, security, scalability, and energy efficiency, supports clients on their journey to integrating generative AI frameworks into their enterprise workloads.
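The ingestion claim is easy to sanity-check. A back-of-envelope sketch, taking the "more than 8 million documents in an hour" figure at face value (document size and pipeline details are not specified in the release):

```python
# Back-of-envelope rate implied by "more than 8 million documents per hour".
DOCS_PER_HOUR = 8_000_000

docs_per_second = DOCS_PER_HOUR / 3600       # sustained ingestion rate
ms_per_document = 1000 / docs_per_second     # average wall-clock budget per doc

print(f"{docs_per_second:,.0f} docs/s, ~{ms_per_document:.2f} ms per document")
```

That works out to roughly 2,200 documents per second, i.e. under half a millisecond of average wall-clock time per document across the pipeline.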

Copilot:
Yes: the IBM Spyre Accelerator is fabricated on a 5nm node. This was confirmed in IBM’s October 2025 press release and supporting coverage of the chip’s commercial launch:

Spyre Node Size and Architecture
  • Node: 5nm CMOS
  • Transistor Count: 25.6 billion
  • Cores: 32 AI-optimized inference cores
  • Form Factor: 75-watt PCIe card
  • Memory: 128 GB LPDDR5
  • Performance: ~300 TOPS (8-bit equivalent)
This puts Spyre in the same node class as Apple’s M1/M2, NVIDIA’s Hopper H100, and AMD’s MI300A, though its architecture is tuned for low-latency, secure inference in enterprise systems rather than general-purpose AI training.

Why 5nm Makes Sense

IBM’s choice of 5nm reflects a balance between:
  • Power efficiency (critical for 75W PCIe cards)
  • Density (to pack 32 inference cores and a 128GB LPDDR5 memory interface)
  • Maturity (5nm is now a stable node for high-yield enterprise chips)
It also aligns with IBM’s Telum II CPU, which powers z17 and LinuxONE 5 systems and is also built on 5nm. This co-node synergy simplifies integration and thermal design.
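The efficiency point can be made concrete. A rough figure of merit, again assuming the reported ~300 TOPS estimate against the stated 75W card power (card-level only, ignoring host and memory power):

```python
# Card-level performance-per-watt under the assumed ~300 TOPS (8-bit) figure.
SPYRE_TOPS = 300    # reported estimate, not an official IBM spec
SPYRE_WATTS = 75    # announced PCIe card power envelope

tops_per_watt = SPYRE_TOPS / SPYRE_WATTS
print(f"{tops_per_watt:.1f} TOPS/W at the card level")
```

A passively coolable 75W card at this efficiency is what makes dense 48-card configurations practical inside a mainframe frame.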

Strategic Implication

Spyre’s 5nm foundation enables on-prem, real-time AI inference for regulated industries, where data locality, latency, and security matter more than raw FLOPs. It’s not chasing NVIDIA’s training crown; it’s carving a niche in enterprise AI acceleration.

Want to explore how this might pair with chiplet-based inference fabrics or secure edge inference nodes?