Leanstral 1.5: Proof Abundance for All (mistral.ai)

0 points 2 hours ago | visit original

🤖 AI Summary

Leanstral 1.5 has launched as a significant upgrade for formal verification. The model is free and open-source under the Apache-2.0 license and has 6 billion active parameters. It saturates the miniF2F benchmark, solves 587 of 672 PutnamBench problems, and reaches state-of-the-art scores of 87% on FATE-H and 34% on FATE-X. Training ran in three stages (mid-training, supervised fine-tuning, and reinforcement learning) to strengthen its proof engineering and real-world code verification. In practice, it discovered five previously unknown bugs across 57 tested repositories.

These results show the model handling complex proof tasks in real-world settings, such as proving time complexities for data structures like AVL trees and catching hidden flaws in code. It does so at a much lower operational cost: about $4 per problem, compared with hundreds of dollars for competitors, which could open formal verification to broader adoption. The model is fully open-source, available on Hugging Face and through a free API, giving AI researchers and developers a tool for proof engineering in Lean 4 and pointing toward practical AI applications in formal mathematics and software reliability.
