Qwen3.7-Max Ran for 35 Hours on Unknown Hardware and Achieved a 10× Speedup (firethering.com)

0 points 1 hour ago | visit original

🤖 AI Summary

Alibaba's Qwen3.7-Max model achieved a 10× speedup while optimizing a kernel on an unfamiliar hardware platform, T-Head ZW-M890 PPUs, during a 35-hour autonomous run. Tasked with optimizing the Extend Attention kernel from SGLang, the model made 1,158 tool calls with no prior knowledge of the architecture, diagnosing issues, iterating on improvements, and redesigning components as it went. Competing models GLM 5.1 and Kimi K2.6 reached lower speedups of 7.3× and 5×, respectively, which points to Qwen3.7-Max's strength in long-running problem-solving without human oversight.

The result matters to the AI/ML community because it highlights an approach to model training called environment scaling, in which training across diverse environments improves a model's adaptability and performance. Rather than improving primarily through exposure to text, Qwen3.7-Max is said to benefit from training to perform under varied conditions, which supports its cross-harness generalization. However, the model currently has no open weights, so it may not suit teams that require data privacy or self-hosting, which limits its accessibility despite its strong benchmark performance on core tasks.
