🤖 AI Summary
Microsoft CTO Kevin Scott said the company intends to migrate the bulk of its AI workloads away from AMD and Nvidia GPUs toward its own Maia accelerators, a shift driven primarily by performance-per-dollar economics and the freedom of end-to-end system design (compute, networking, cooling). Microsoft's first Maia chip has already carried production traffic, running GPT‑3.5 inference and freeing up GPU capacity, and Scott signaled that "mainly Microsoft silicon" will become the datacenter default if the next-generation Maia meets expectations.
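The performance-per-dollar framing boils down to a simple ratio. The sketch below makes it concrete; all cost and throughput figures are hypothetical placeholders (Microsoft has not published Maia pricing), chosen only to illustrate how a part with lower peak compute can still win on this metric.

```python
# Hypothetical perf-per-dollar comparison. None of these numbers are
# published figures; they only illustrate the metric Scott is citing.
def perf_per_dollar(tflops: float, cost_usd: float) -> float:
    """TeraFLOPS delivered per dollar of accelerator cost."""
    return tflops / cost_usd

# Placeholder values, assumed for illustration only.
merchant_gpu = perf_per_dollar(tflops=1000.0, cost_usd=30_000.0)
in_house_asic = perf_per_dollar(tflops=800.0, cost_usd=15_000.0)

print(f"merchant GPU:  {merchant_gpu:.4f} TFLOPS/$")
print(f"in-house ASIC: {in_house_asic:.4f} TFLOPS/$")
# A cheaper in-house part can beat a faster merchant GPU on this ratio,
# which is the economic logic behind the migration.
```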
Technically, the first-generation Maia 100 delivered about 800 teraFLOPS of BF16 compute, 64 GB of HBM2e, and ~1.8 TB/s of memory bandwidth, trailing contemporary Nvidia and AMD flagships. Microsoft is targeting stronger compute, memory, and interconnect in a second-generation Maia due next year. Even so, a complete replacement is unlikely: large customers and third-party workloads still demand Nvidia/AMD GPUs, and peers like Google and Amazon use custom TPUs/Trainium primarily for in-house services while still offering GPUs at scale. Microsoft is also developing other silicon (the Cobalt CPU and security accelerators), underscoring a broader push toward vertical hardware/software co-design to lower costs and optimize AI workloads.
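For rough context, the reported Maia 100 figures can be put next to a contemporary merchant GPU. The H100 SXM numbers below are widely published ballpark specs (dense BF16), used here only as a reference point; they do not come from the article.

```python
# Reported Maia 100 numbers from the summary vs. ballpark Nvidia H100 SXM
# specs (dense BF16) as a reference point, not an exhaustive comparison.
specs = {
    "Maia 100": {"bf16_tflops": 800.0, "hbm_gb": 64.0, "mem_bw_tbs": 1.8},
    "H100 SXM": {"bf16_tflops": 989.0, "hbm_gb": 80.0, "mem_bw_tbs": 3.35},
}

maia, h100 = specs["Maia 100"], specs["H100 SXM"]
for key, label in [("bf16_tflops", "BF16 compute"),
                   ("hbm_gb", "HBM capacity"),
                   ("mem_bw_tbs", "memory bandwidth")]:
    print(f"{label}: Maia 100 at {maia[key] / h100[key]:.0%} of H100")
# Prints roughly 81% / 80% / 54%. The memory-bandwidth gap is the widest,
# which matters most for memory-bound inference workloads.
```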