🤖 AI Summary
A veteran ML engineer recounts coming up through AlexNet-era deep learning, building Jetpac's scalable inference stack, and later leading mobile TensorFlow work, to make the case that the industry is in an "AI bubble" driven by exuberant GPU and data-center spending rather than investment in software efficiency. He shares a practical gripe: his efficiency-focused startup (Moonshine) has struggled to raise funds despite clear ROI, even as firms and VCs prize GPU counts as competitive signaling. He warns that this allocation is irrational given both the cost and the environmental stakes.
Technically, the piece highlights a concrete opportunity: GPU utilization is often below 50%, and worse still for interactive, small-batch, memory-bound workloads, so well-engineered software and full-stack optimization (hardware, OS, model architecture, runtime) can deliver large gains. Engineers have outperformed vendor libraries on the same chips, and inference can often be shifted to much cheaper CPUs or mobile devices with modest model changes. The author likens Nvidia's dominance to Sun's in the dot-com era and predicts a market correction: cheaper PCs plus open-source models and smarter software could undercut hardware-centric moats, yielding big cost, energy, and scalability benefits for the ML community.
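The low-utilization claim for small-batch inference can be illustrated with a back-of-envelope roofline check (the numbers below are illustrative assumptions, not figures from the article): at batch size 1, a dense layer is a matrix-vector product, so each weight byte fetched from memory supports only a couple of FLOPs, far below what a GPU needs to keep its arithmetic units busy.

```python
def arithmetic_intensity(batch, d_in, d_out, bytes_per_weight=2):
    """FLOPs per byte of weight traffic for a dense layer (fp16 weights assumed)."""
    flops = 2 * batch * d_in * d_out           # one multiply-accumulate = 2 FLOPs
    weight_bytes = d_in * d_out * bytes_per_weight
    return flops / weight_bytes

# Hypothetical accelerator: 100 TFLOP/s compute, 1 TB/s memory bandwidth.
# Kernels whose intensity falls below this ridge point are memory-bound,
# i.e. the compute units idle while waiting on weight reads.
ridge = 100e12 / 1e12   # 100 FLOPs per byte

for batch in (1, 8, 64, 256):
    ai = arithmetic_intensity(batch, 4096, 4096)
    bound = "memory-bound" if ai < ridge else "compute-bound"
    print(f"batch={batch:4d}  intensity={ai:6.1f} FLOP/byte  -> {bound}")
```

With these assumed specs, only large batches cross the ridge point, which is why interactive (batch-1) serving leaves so much GPU compute idle and why cheaper, lower-bandwidth-ratio hardware such as CPUs can be competitive for it.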