Dylan Patel:
Going to be dropping something huge in 24 hours
I think it'll reshape how everyone thinks about chips, inference, and infrastructure
It's directly supported by NVIDIA, AMD, Microsoft, OpenAI, Together AI, CoreWeave, Nebius, PyTorch Foundation, Supermicro, Crusoe, HPE, Tensorwave, vLLM, SGLang, etc.
If you are press, please sign up below and we can send you embargoed details
PS Me? It's all about EUV/ArFi Shrink n Stack.
Copilot: 2nm chips (and Intel 18A with backside power delivery) consume up to 30–45% less power than 5nm chips for equivalent workloads, and they produce significantly less waste heat thanks to improved transistor efficiency and advanced power delivery.
Here’s a detailed breakdown of how 2nm and 5nm nodes differ in power and thermal behavior:

Waste Heat Generation
- 5nm chips:
  - Use FinFET architecture, which has higher leakage and less efficient power routing.
  - Waste heat is concentrated near the power delivery layers, requiring more aggressive cooling.
  - Typical 5nm-class AI accelerators (e.g., NVIDIA Hopper) draw 400–600W, with roughly 30–40% of that power lost to leakage and power-delivery overhead rather than useful switching.
- 2nm chips:
  - Use Gate-All-Around (GAA) nanosheet transistors, reducing leakage and improving switching efficiency.
  - Backside power delivery moves the power rails beneath the transistor layer, reducing resistance and heat hotspots.
  - Expected to cut thermal output by 20–35%, enabling higher rack density and lower cooling costs (a rough sketch of this arithmetic follows below).
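
To put rough numbers on the heat claims, here is a minimal back-of-the-envelope sketch in Python, assuming the midpoints of the ranges quoted above (500W per chip, ~27.5% reduction) and a hypothetical 32-accelerator rack; the figures are illustrative assumptions, not measured silicon data:

```python
# Back-of-the-envelope comparison of per-chip power and waste heat for a
# 5nm-class vs. a hypothetical 2nm-class accelerator, using the ranges
# quoted above. All numbers are illustrative, not measured silicon data.

def chip_heat_model(board_power_w: float, thermal_reduction: float = 0.0) -> dict:
    """Estimate waste heat for one accelerator.

    Essentially all electrical power a chip draws is eventually
    dissipated as heat; thermal_reduction models the node-level
    power saving (e.g., 0.20-0.35 for a 2nm-class part).
    """
    effective_power = board_power_w * (1.0 - thermal_reduction)
    return {
        "board_power_w": effective_power,
        "waste_heat_w": effective_power,  # ~100% of drawn power becomes heat
    }

# 5nm-class accelerator at the midpoint of the quoted 400-600 W range.
n5 = chip_heat_model(500.0)

# Hypothetical 2nm-class part doing the same work, with the quoted
# 20-35% thermal reduction taken at its midpoint (~27.5%).
n2 = chip_heat_model(500.0, thermal_reduction=0.275)

print(f"5nm: {n5['waste_heat_w']:.0f} W of heat per chip")
print(f"2nm: {n2['waste_heat_w']:.0f} W of heat per chip")

# Rack-level impact, assuming a hypothetical 32-accelerator rack.
CHIPS_PER_RACK = 32
saved_kw = (n5["waste_heat_w"] - n2["waste_heat_w"]) * CHIPS_PER_RACK / 1000
print(f"Rack of {CHIPS_PER_RACK}: ~{saved_kw:.1f} kW less heat to remove")
```

At the assumed midpoints this works out to roughly 4.4 kW less heat per rack, which is where the higher-density and lower-cooling-cost claims come from.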
Real-World Impact for AI Inference
- Energy per token: 2nm logic paired with HBM4 memory could cut inference energy per token by 30–50% compared with 5nm + HBM3E (see the sketch after this list).
- Rack-level efficiency: enables denser AI clusters with lower total power draw and better thermal profiles.
- Cooling infrastructure: less reliance on liquid or immersion cooling, especially for inference workloads.
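
A similar hedged sketch for the energy-per-token claim; the 600W board power and 10,000 tokens/s throughput are hypothetical placeholders, not benchmarks of any real part, and the 2nm case simply applies the midpoint 40% saving:

```python
# Rough energy-per-token arithmetic for the 30-50% claim above.
# Power and throughput figures are hypothetical placeholders.

def joules_per_token(power_w: float, tokens_per_second: float) -> float:
    """Energy per generated token in joules (1 W = 1 J/s)."""
    return power_w / tokens_per_second

# Assumed 5nm + HBM3E baseline: 600 W board power, 10,000 tokens/s.
baseline = joules_per_token(600.0, 10_000.0)

# Assumed 2nm + HBM4 part: modeled as the same throughput at
# proportionally lower power (midpoint of the 30-50% range, i.e. 40%).
next_gen = joules_per_token(600.0 * (1 - 0.40), 10_000.0)

print(f"5nm + HBM3E: {baseline * 1000:.1f} mJ/token")
print(f"2nm + HBM4:  {next_gen * 1000:.1f} mJ/token")

# Fleet-level impact: energy to serve one trillion tokens, in kWh.
TOKENS = 1e12
J_PER_KWH = 3.6e6
print(f"Per 1T tokens: {baseline * TOKENS / J_PER_KWH:,.0f} kWh vs "
      f"{next_gen * TOKENS / J_PER_KWH:,.0f} kWh")
```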
Summary
- 2nm is a leap in both performance and efficiency, not just a shrink.
- It reduces total power draw, waste heat, and cooling requirements, making it ideal for AI inference at scale.
- Combined with HBM4 and advanced packaging, it’s a cornerstone of next-gen AI infrastructure.