“Rewriting the blueprint, not removing bricks”: Multiverse Computing says it can shrink large AI models and cut memory use in half (www.techradar.com)

🤖 AI Summary
Spanish AI company Multiverse Computing has released HyperNova 60B 2602, a compressed version of OpenAI’s gpt-oss-120B that cuts memory requirements from 61GB to 32GB while maintaining near-parity performance. This matters for the AI/ML community because it lets large language models (LLMs) run on less powerful hardware, offering a cost-effective option for developers facing budget and energy constraints. The model is freely available on Hugging Face, broadening access for research and practical applications. Multiverse uses its proprietary CompactifAI technology, which restructures the internal weight matrices of transformer models with quantum-inspired tensor networks, achieving up to 93% memory reduction and lower parameter counts without retraining the original model. The company reports an accuracy loss of only 2-3% even at heavy compression, alongside gains on agent-focused benchmarks such as tool use and coding workflows. As LLMs evolve, compression of this kind could make smaller, more efficient models deployable across diverse environments, from cloud servers to edge devices.
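The general idea behind this kind of compression can be sketched with an ordinary truncated SVD, the simplest way to replace a dense weight matrix with smaller factors. This is a minimal, hypothetical illustration only: CompactifAI's actual tensor-network decomposition is proprietary and not shown here, and the shapes and rank below are made-up values.

```python
import numpy as np

def compress_weight(W, rank):
    """Replace a dense weight matrix W with two low-rank factors A and B,
    so that A @ B approximates W with far fewer parameters."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # shape (m, rank), singular values folded in
    B = Vt[:rank, :]             # shape (rank, n)
    return A, B

rng = np.random.default_rng(0)
# Synthetic "weight matrix": approximately rank-8 signal plus small noise.
W = rng.standard_normal((512, 8)) @ rng.standard_normal((8, 512))
W += 0.01 * rng.standard_normal((512, 512))

A, B = compress_weight(W, rank=8)
original_params = W.size                 # 512 * 512 = 262144
compressed_params = A.size + B.size      # 512*8 + 8*512 = 8192
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {original_params} -> {compressed_params}")
print(f"relative reconstruction error: {rel_err:.4f}")
```

When the underlying matrix really is close to low-rank, the factorized form stores a small fraction of the original parameters while reconstructing it almost exactly; tensor-network methods generalize this idea to higher-order decompositions of the weight tensors.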