🤖 AI Summary
A new deep-dive report outlines why HBM (High-Bandwidth Memory) is the linchpin of modern AI accelerators and maps the technical and supply-chain shifts that will determine whether HBM can scale to meet exploding GenAI demand. It highlights a coming inflection with HBM4, notably custom base dies commissioned by major accelerator teams (Nvidia, OpenAI, AMD), and shows how HBM's combination of wide I/O, vertical stacking, and proximity to compute makes it the preferred, albeit expensive, memory for high-bandwidth, high-capacity AI workloads. The report also quantifies rising bit demand (Nvidia alone driving massive capacity growth) and flags that one technological change could nonetheless slow the trend toward ever-taller HBM stacks.
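To make the wide-I/O point concrete, here is a minimal back-of-the-envelope sketch (the figures are typical published numbers, not taken from the report): a single HBM stack exposes a 1024-bit interface, so even modest per-pin speeds multiply into roughly a terabyte per second, whereas a conventional graphics DRAM part is only tens of bits wide.

```python
# Illustrative bandwidth arithmetic: bandwidth = interface width x per-pin data rate.
# Pin rates below are typical published figures, not numbers from the report.

def bandwidth_gb_s(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s for a memory interface."""
    return bus_width_bits * pin_rate_gbps / 8  # bits -> bytes

# One HBM3E stack: 1024-bit interface, ~9.6 Gb/s per pin
print(f"HBM3E stack : {bandwidth_gb_s(1024, 9.6):.0f} GB/s")  # ~1229 GB/s

# One GDDR6X device: 32-bit interface, ~21 Gb/s per pin
print(f"GDDR6X chip : {bandwidth_gb_s(32, 21.0):.0f} GB/s")   # ~84 GB/s
```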
Technically, the piece explains why HBM is hard to build: TSV creation, double-sided bumping, silicon interposers/CoWoS, and thermal and power delivery up multi-die stacks all add complexity and lower front-end and back-end yields. SK Hynix's HBM3E "all-around" power TSVs reduce IR drop, Micron claims major power wins from its TSV/PDN design, and SK Hynix's MR-MUF packaging trades better thermal performance and throughput against alternatives like TCB with NCF. Layer-yield math compounds (roughly per-layer yield raised to the number of layers), making stacks beyond 8–12 layers exponentially harder, and bonding tools and field service (the Hanmi vs Hanwha/ASMPT dispute) pose immediate supply risk. The combined technical and tooling bottlenecks, plus Samsung's weaker yields, mean tight supply, high prices, and strategic vendor leverage that will shape accelerator roadmaps, sourcing, and future memory architecture choices (e.g., compute-under-memory, SRAM tags, repeater PHYs, LPDDR+HBM combos).
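The layer-yield argument is simple compounding: if each bonded layer survives with probability y, an n-high stack yields roughly y^n, so moving from 8 to 12 or 16 layers hurts far more than linearly. A minimal sketch (per-layer yields are illustrative, not the report's figures):

```python
# Stack yield compounds per layer: yield_stack ~ per_layer_yield ** n_layers.
# The per-layer yields below are illustrative assumptions, not data from the report.

def stack_yield(per_layer_yield: float, n_layers: int) -> float:
    """Probability that every layer in an n-high stack is good."""
    return per_layer_yield ** n_layers

for y in (0.99, 0.97, 0.95):
    row = ", ".join(f"{n}-hi: {stack_yield(y, n):.1%}" for n in (8, 12, 16))
    print(f"per-layer {y:.0%} -> {row}")

# per-layer 99% -> 8-hi: 92.3%, 12-hi: 88.6%, 16-hi: 85.1%
# per-layer 95% -> 8-hi: 66.3%, 12-hi: 54.0%, 16-hi: 44.0%
```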