LFM2-24B-A2B: Scaling Up the LFM2 Architecture (www.liquid.ai)

🤖 AI Summary
Liquid AI has announced an early release of LFM2-24B-A2B, the largest model in its LFM2 family to date. The model uses a sparse Mixture of Experts (MoE) design: of its 24 billion total parameters, only about 2 billion are active per token. With this release, the LFM2 family spans 350 million to 24 billion parameters, with quality improving consistently across benchmarks as the models scale. Designed to run within a 32GB RAM budget, LFM2-24B-A2B can be deployed on a variety of systems, including consumer-grade laptops and desktops, making it accessible for a broad range of applications.

Architecturally, LFM2-24B-A2B retains LFM2's hybrid structure, combining gated short-convolution blocks with a small number of grouped query attention blocks, which yields faster inference at lower memory cost. The model scales by increasing depth and expert count while keeping the per-token compute budget roughly constant. Initial benchmarks show LFM2-24B-A2B outperforming comparable models in inference throughput, highlighting the efficiency of the design. Further gains are expected once pre-training completes; the model is available for download now.
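To make the sparse-MoE idea concrete, here is a minimal sketch of token-level top-k expert routing: a router scores all experts, but only the top k run per token, so active compute is a fraction of the total parameter count. This is an illustrative toy (NumPy, tiny dimensions, ReLU-MLP experts), not Liquid AI's actual implementation; the function names, router, and expert shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through the top-k of many experts.

    x:       (d,) token activation
    gate_w:  (n_experts, d) router weights
    experts: list of (w_in, w_out) weight pairs, one per expert
    Only k experts execute per token, so per-token compute
    scales with k, not with the total expert count.
    """
    logits = gate_w @ x                       # router score per expert
    topk = np.argsort(logits)[-k:]            # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                  # softmax over the selected k
    out = np.zeros_like(x)
    for w, idx in zip(weights, topk):
        w_in, w_out = experts[idx]
        out += w * (w_out @ np.maximum(w_in @ x, 0.0))  # tiny ReLU-MLP expert
    return out

# Toy sizes: 8 experts, only 2 active per token.
d, n_experts, k = 16, 8, 2
gate_w = rng.normal(size=(n_experts, d))
experts = [(rng.normal(size=(4 * d, d)), rng.normal(size=(d, 4 * d)))
           for _ in range(n_experts)]

total_params = sum(a.size + b.size for a, b in experts)
active_params = k * (experts[0][0].size + experts[0][1].size)
print(total_params, active_params)  # active is k / n_experts of the total
```

The same ratio is what the model name advertises: "24B-A2B" means roughly 24B total parameters with about 2B active per token, which is why it fits a 32GB RAM budget at inference time.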