"hyperscalers commit to annual releases through the decade"
The artificial intelligence industry is developing so rapidly that the leading suppliers of AI accelerators, AMD and Nvidia, have moved to a yearly product release cadence. Hyperscalers that can afford to develop their own silicon appear to have followed suit: Amazon Web Services, Google, and Meta also plan to release new AI accelerators every year through the late 2020s.
However, things are going to change for AMD with the Instinct MI400 series, set to land sometime in the second half of 2026. The upcoming MI450X will focus on AI workloads, while the MI430X will target traditional supercomputing applications. Both processors are expected to be made using TSMC's N2 (2nm-class) fabrication process, packaged using CoWoS-L technology, and equipped with HBM4 memory.
Amazon exclusively uses its AI accelerators in its own data centers, so the company does not disclose many details about its two chips. Amazon uses its Trainium chips for both training and inference, and its Inferentia chips solely for inference workloads.
After Ironwood (TPU v7p), Google is expected to release its 8th-generation TPUs, the v8p and v8e, which are rumored to be made on TSMC's 3nm-class process technology and to feature up to 288 GB of HBM3E memory. Given that these parts stick with HBM3E rather than moving to HBM4, do not expect a major performance increase from them. Google's v8p and v8e accelerators are slated for 2026.
Meta has already deployed earlier MTIA accelerators in its data centers, but that deployment seems limited to internal workloads and was not necessarily at the scale one would expect for a full infrastructure shift. The company's more recent MTIA chips are made on TSMC's 5nm-class fabrication process and double onboard memory to 128 GB (MTIA 2) and again to 256 GB (MTIA 2.5). The company is expected to get more aggressive with subsequent generations of MTIA, however.
Meta's MTIA v3, due in 2026, is projected to be a considerably higher-performance solution, as it is expected to pair a compute chiplet made on TSMC's N3 fabrication process with HBM3E memory. The company is also expected to release MTIA v4 in 2027. This accelerator will likely use two or more chiplets fabbed on TSMC's 2nm-class fabrication process and be equipped with HBM4 memory.
Microsoft is currently working on its next-generation Maia processors: the chip codenamed Braga (Maia 200?) will use TSMC's 3nm node and HBM4 memory. Braga is allegedly due in 2026, with its successor, Clea (Maia 300?), following at a later date. However, considering the limited adoption of Maia, Microsoft might want to recalibrate the positioning of its own AI accelerators to reduce complexity and risk.
Nvidia's upcoming data center GPU carries the codename Rubin and is expected in late 2026. The initial model, tentatively referred to as R100/R200, will consist of two reticle-sized GPU dies and two dedicated I/O chiplets, all built using TSMC's 3nm-class node, likely N3P or something customized for Nvidia's needs. Memory-wise, it will integrate 288 GB of HBM4 across eight stacks, each running at 6.4 GT/s, yielding an impressive ~13 TB/s of total memory bandwidth.
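For readers who want to sanity-check that bandwidth figure, the arithmetic is straightforward if you assume the JEDEC-baseline 2048-bit interface per HBM4 stack (an assumption on my part; Nvidia has not detailed Rubin's memory interface width). A quick back-of-the-envelope calculation in Python:

    # Rough HBM4 bandwidth estimate for Rubin (R100/R200).
    # The 2048-bit-per-stack interface width is assumed (JEDEC HBM4 baseline);
    # Nvidia has not confirmed Rubin's exact memory configuration.
    stacks = 8              # HBM4 stacks per package
    width_bits = 2048       # interface width per stack (assumed)
    data_rate = 6.4         # GT/s per pin

    bandwidth_gb_s = stacks * width_bits * data_rate / 8   # bits -> bytes
    print(f"~{bandwidth_gb_s / 1000:.1f} TB/s")            # prints ~13.1 TB/s

That lands right at the ~13 TB/s figure cited above.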
Recently, Broadcom confirmed that an undisclosed client intends to procure $10 billion worth of custom AI processors, set to be delivered in the third quarter of 2026. While the industry believes that the product in question is OpenAI's first custom AI processor, this has never been formally confirmed; Broadcom has publicly indicated that the customer in question is a different party, even though it does hold an OpenAI order as well.
PS: Is Moore's Law dead? Hardly; if anything, it is on steroids. With "More Than Moore" scaling, expect a bit bonanza through 2030.