VSORA today announced the production launch of Jotunn8, an AI inference chip.
VSORA, a French semiconductor innovator, today announced the production launch of Jotunn8, an artificial intelligence (AI) inference chip engineered to power next-generation data center applications, delivering breakthrough efficiency and scalability for large-scale AI deployment.
Copilot: No, VSORA has not publicly disclosed the node size for the Jotunn8 AI inference chip. Their press materials emphasize performance, efficiency, and scalability, but omit specific process technology details.
Here’s what we do know:
What’s Confirmed About Jotunn8
- Purpose: Designed for AI inference, not training—optimized for real-time workloads like chatbots, fraud detection, and LLM APIs.
- Performance: Delivers up to 3,200 teraflops at over 50% sustained utilization, while consuming about half the power of comparable inference chips (see the back-of-envelope sketch after this list).
- Deployment: Targeted for data center-scale rollout, with emphasis on cost-efficiency and sustainability.
- Manufacturing: VSORA is partnering with global foundries for production, but has not named the fab or node.
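Taken at face value, those headline numbers can be turned into a rough efficiency estimate. The sketch below is a back-of-envelope calculation, not a benchmark: the peak TFLOPS and utilization figures come from VSORA's claims, while the device power is a placeholder assumption, since no board or package power figure has been published.

```python
# Back-of-envelope efficiency estimate for Jotunn8 based on VSORA's claims.
# Peak throughput and utilization are taken from the announcement; the power
# figure is a placeholder assumption, since no board power has been published.

PEAK_TFLOPS = 3200            # claimed peak throughput
SUSTAINED_UTILIZATION = 0.50  # "over 50% sustained utilization" (lower bound)
ASSUMED_POWER_W = 500         # hypothetical device power, NOT a published figure

sustained_tflops = PEAK_TFLOPS * SUSTAINED_UTILIZATION
tflops_per_watt = sustained_tflops / ASSUMED_POWER_W

print(f"Sustained throughput: {sustained_tflops:.0f} TFLOPS")
print(f"Efficiency at assumed {ASSUMED_POWER_W} W: {tflops_per_watt:.2f} TFLOPS/W")
```

Whatever power figure you plug in, the useful comparison point is sustained (not peak) TFLOPS per watt, which is what the "half the power of comparable chips" claim ultimately speaks to.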
What’s Missing
- No mention of process node: There’s no reference to 5nm, 3nm, or any specific technology node in their public materials.
- No fab named: VSORA has not confirmed whether the chip is being fabbed by TSMC, Samsung, Intel, or another foundry.
- No packaging details: There’s no info on chiplet architecture, HBM integration, or advanced packaging like CoWoS or I-Cube.
Speculative Context
- Given the power efficiency and throughput claims, it’s plausible Jotunn8 is built on a sub-7nm node, possibly 5nm or 4nm-class.
- If VSORA is targeting volume deployment in 2025–2026, they may be using TSMC N5/N4, Samsung SF4, or Intel 4/3—but this remains unconfirmed.
If you want to dig into VSORA’s architecture claims, like their “near-theory performance” or how they handle agentic AI workloads, I can help dissect that next.
In the AI inference space—especially for data center deployment in 2025 and beyond—yes, anything larger than 4nm is generally noncompetitive unless offset by architectural innovation or niche targeting. Here's why:
Why Sub-4nm Is the Competitive Baseline
- Power efficiency: AI inference workloads are power-bound. Nodes like 3nm (N3B, SF3) and 4nm (N4P, SF4X) offer better dynamic and leakage power profiles, which directly impact TCO (a simple cost sketch follows this list).
- Density and cost-per-token: LLM inference benefits from high SRAM density and fast interconnects. Larger nodes (e.g., 7nm or 10nm) require bigger dies or more chiplets, raising cost and latency.
- Packaging synergy: Sub-4nm nodes pair better with advanced packaging (CoWoS, I-Cube, Foveros), enabling tighter integration with HBM stacks and accelerators.
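To make the TCO point concrete, here is a minimal sketch of how accelerator power draw feeds into inference operating cost. Every input (power, electricity price, throughput, PUE) is an illustrative assumption chosen to show the mechanics, not a measurement of any specific chip.

```python
# Illustrative inference TCO math: how accelerator power draw turns into
# energy cost per token. All inputs are assumptions chosen to show the
# mechanics, not measurements of any specific chip.

HOURS_PER_YEAR = 24 * 365
ELECTRICITY_USD_PER_KWH = 0.10   # assumed energy price
PUE = 1.3                        # assumed facility overhead (cooling, power delivery)

def yearly_energy_cost(device_power_w: float) -> float:
    """Yearly electricity cost for one accelerator, including facility overhead."""
    kwh = device_power_w / 1000 * HOURS_PER_YEAR * PUE
    return kwh * ELECTRICITY_USD_PER_KWH

def energy_cost_per_million_tokens(device_power_w: float, tokens_per_second: float) -> float:
    """Energy cost per million generated tokens, assuming full-time operation."""
    tokens_per_year = tokens_per_second * 3600 * HOURS_PER_YEAR
    return yearly_energy_cost(device_power_w) / tokens_per_year * 1_000_000

# Two hypothetical parts with equal throughput but different power draw,
# roughly the gap one process-node generation might buy.
for name, power_w in [("older-node part", 1000.0), ("newer-node part", 700.0)]:
    cost = energy_cost_per_million_tokens(power_w, tokens_per_second=10_000.0)
    print(f"{name}: ${cost:.4f} energy cost per 1M tokens")
```

At fleet scale that per-token gap compounds across thousands of devices, which is the sense in which power efficiency "directly impacts TCO."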
Exceptions That Prove the Rule
- Architectural outliers: Some chips (like Cerebras WSE or Groq’s tensor streaming) use older nodes but win on parallelism and memory bandwidth.
- Edge inference: For low-power edge devices, 7nm or even 12nm may still be viable due to cost and thermal constraints.
- Startup constraints: New entrants like VSORA may prototype on mature nodes (e.g., 7nm) for cost reasons, then migrate to 4nm or below for volume.
VSORA’s Case
- Their Jotunn8 claims 3,200 TFLOPS with half the power of competitors. If true, that implies a sub-5nm node, likely TSMC N4P or Samsung SF4X.
- Anything larger would struggle to meet those efficiency claims unless they’ve built a radically novel architecture (see the rough node-scaling sketch below).
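The "implies a sub-5nm node" inference rests on compounding per-node power gains. As a rough sanity check, the sketch below assumes the roughly 30% iso-performance power reduction that foundries typically quote per full node step; that figure is an industry rule of thumb, not anything VSORA has disclosed.

```python
# Rough sanity check behind "half the power implies a newer node".
# Assumption: roughly 30% lower power at equal performance per full node
# step, an industry rule of thumb rather than a VSORA disclosure.

POWER_REDUCTION_PER_NODE = 0.30  # assumed iso-performance saving per node step

relative_power = 1.0
steps = 0
while relative_power > 0.5:
    relative_power *= (1.0 - POWER_REDUCTION_PER_NODE)
    steps += 1

print(f"Node steps needed to halve power from process alone: {steps}")
print(f"Relative power after {steps} step(s): {relative_power:.2f}")
```

With ~30% per step it takes two full node generations to halve power from process scaling alone, which is why a 2x efficiency claim from a larger node would have to lean almost entirely on architecture.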
Bottom line: 4nm is the floor for competitive AI inference at scale, unless you're doing something truly unconventional. Want to riff on how SF4X compares to N4P in SRAM density or how VSORA might be architecting around memory bottlenecks? Let’s dig in.