AI models can do useful work (www.theregister.com)

🤖 AI Summary
Researchers at UC Berkeley used OpenEvolve (an open-source reimplementation of DeepMind's AlphaEvolve) to let AI iteratively generate, evaluate and refine system-level code, and it discovered a far faster load-balancing method for the Expert Parallelism Load Balancer (EPLB) used in LLM inference. Starting from DeepSeek's Python baseline, which relied on a linear for-loop (≈540 ms), OpenEvolve, running a mix of Gemini 2.5 Flash and Flash Lite models (80/20) for under five hours at a cost of under $10, rewrote the hot path with vectorized tensor operations and a zig-zag partitioning scheme, cutting runtime to 3.7 ms. That is a 146× speedup over DeepSeek's baseline and 5× faster than a non-public reference implementation (19.6 ms). The team also reports a separate 3× speedup on relational analytics queries that invoke LLMs per row.

The work showcases "AI-Driven Research for Systems" (ADRS): AI as an algorithm-discovery engine that can search vast literature and design spaces to yield novel or unexpected optimizations. Technical implications include the automated replacement of scalar loops with hardware-friendly tensorized code and new partitioning heuristics tailored to GPU memory and throughput. More broadly, systems research and industry tooling may shift toward AI-guided optimization, with humans focusing on problem formulation, validation criteria and safety. The authors stress that the current bottleneck is robust evaluation and verification; without it, automated algorithm discovery risks brittle or unsafe deployments despite clear performance gains.
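
The article does not show the generated code, but the flavor of the optimization (replacing a per-expert Python loop with tensor operations and a zig-zag partitioning scheme) can be sketched. The snippet below is an illustrative PyTorch example, not the actual OpenEvolve output; the function name, load model and snake-order heuristic are assumptions made for clarity.

```python
import torch

def assign_experts_zigzag(loads: torch.Tensor, num_gpus: int) -> torch.Tensor:
    """Map each expert to a GPU using a zig-zag (snake) order over experts
    sorted by descending load, with tensor ops instead of a Python loop.

    loads: 1-D tensor of per-expert load estimates.
    Returns a 1-D tensor: expert index -> GPU index.
    """
    num_experts = loads.numel()
    order = torch.argsort(loads, descending=True)            # heaviest experts first
    ranks = torch.arange(num_experts, device=loads.device)   # position in sorted order
    pass_idx = ranks // num_gpus                              # which row of the snake
    col = ranks % num_gpus                                    # column within that row
    # Reverse direction on odd rows: 0..G-1, then G-1..0, and so on.
    gpu_for_rank = torch.where(pass_idx % 2 == 0, col, num_gpus - 1 - col)
    assignment = torch.empty(num_experts, dtype=torch.long, device=loads.device)
    assignment[order] = gpu_for_rank
    return assignment

# Example: eight experts across two GPUs.
print(assign_experts_zigzag(torch.tensor([5., 3., 9., 1., 7., 2., 8., 4.]), num_gpus=2))
```

In this toy case the per-GPU load totals come out as 19 and 20, i.e. nearly balanced, and the whole assignment stays in vectorized tensor code. That is the general shape of the rewrite the summary describes: the scalar loop disappears and the partitioning heuristic is expressed in hardware-friendly tensor operations.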