Memory AI Server Aims to Shatter the Memory Wall (spectrum.ieee.org)

0 points 2 hours ago

🤖 AI Summary

Majestic Labs has announced Prometheus, an AI server designed to tackle the "memory wall," the memory constraint that limits the performance of large language models (LLMs). With up to 128 terabytes of memory, over 60 times that of Nvidia’s top server, Prometheus aims to significantly speed up LLM inference. The architecture is DRAM-centric: it pairs LPDDR6 memory with a proprietary memory interface that runs over miniature copper cables to span longer distances efficiently, for a memory bandwidth of up to 25.6 terabytes per second. Prometheus also includes the Ignite AI processing unit, which integrates ARM application cores with RISC-V vector and tensor cores on one die to handle LLM computation. The server is modular and Open Compute Project-compliant, which Majestic Labs says will allow upgrades over time, improve energy efficiency, and reduce customers' capital expenditures. Prometheus is set to ship in 2027. Majestic Labs says it will make AI hardware more economical and will support popular frameworks such as PyTorch and Triton without requiring code modifications.
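To put the quoted figures in perspective, here is a rough back-of-envelope sketch in Python. It uses only the numbers in the summary (128 terabytes, 25.6 terabytes per second, "over 60 times") plus some assumptions that do not come from the article: decimal terabytes, bandwidth quoted per server, and a hypothetical model of 1 trillion parameters stored at 2 bytes each. The function names are illustrative, and the decode estimate is the usual bandwidth-bound ceiling for batch size 1, not a performance claim about Prometheus.

```python
# Back-of-envelope figures derived from the numbers quoted in the summary.
# Assumptions (not from the article): decimal terabytes, bandwidth is per
# server, and the example model size is purely hypothetical.

TB = 1e12  # bytes in one decimal terabyte (assumed)

CAPACITY_BYTES = 128 * TB          # "up to 128 terabytes of memory"
BANDWIDTH_BYTES_PER_S = 25.6 * TB  # "up to 25.6 terabytes per second"
CAPACITY_RATIO = 60                # "over 60 times" the comparison server


def implied_baseline_tb(capacity_tb: float, ratio: float) -> float:
    """Upper bound on the comparison server's memory implied by 'over `ratio` times'."""
    return capacity_tb / ratio


def full_sweep_seconds(capacity_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to read every byte of memory once at peak bandwidth."""
    return capacity_bytes / bandwidth_bytes_per_s


def decode_tokens_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling when each generated token reads all weights once."""
    return bandwidth_bytes_per_s / weight_bytes


if __name__ == "__main__":
    # Hypothetical model: 1 trillion parameters at 2 bytes per parameter.
    hypothetical_weight_bytes = 1e12 * 2

    print(f"Implied baseline memory: under {implied_baseline_tb(128, CAPACITY_RATIO):.2f} TB")
    print(f"Full memory sweep: {full_sweep_seconds(CAPACITY_BYTES, BANDWIDTH_BYTES_PER_S):.1f} s")
    print(
        "Decode ceiling (hypothetical model): "
        f"{decode_tokens_per_second(hypothetical_weight_bytes, BANDWIDTH_BYTES_PER_S):.1f} tokens/s"
    )
```

Under these assumptions, the comparison server would hold a little over 2 terabytes, reading all 128 terabytes once at peak bandwidth would take about 5 seconds, and the hypothetical 2-terabyte model would top out near 12.8 tokens per second per sequence. Batching and caching change that last number considerably.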
