Darwin Family: MRI-Trust-Weighted Evolutionary Merging (arxiv.org)

0 points | 1 hour ago

🤖 AI Summary

The Darwin Family framework applies evolutionary merging to large language models: it recombines the weights of existing checkpoints in weight space without gradients, so no additional training is required. The premise is that reorganizing capabilities already present in those checkpoints may be enough to improve frontier-level reasoning performance. The framework has three main components: a 14-dimensional adaptive merge genome for component-level recombination, MRI-Trust Fusion for adaptively balancing layer-importance signals, and an Architecture Mapper that enables cross-architecture breeding between model families. The flagship Darwin-27B-Opus model scores 86.9% on the GPQA Diamond benchmark, ranking sixth among more than 1,250 evaluated models, and it surpasses its fully trained counterparts without any gradient-based training. Because it supports recursive multi-generation evolution and can integrate components from diverse models, the Darwin Family is presented as a scalable, lower-cost alternative to traditional post-training pipelines for reasoning-centric language models.
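To make "gradient-free weight-space recombination" concrete, here is a minimal toy sketch of the general idea. It is not the Darwin Family implementation. It makes several assumptions the summary does not confirm: each "model" is a plain list of layers, the genome holds one interpolation ratio per layer (instead of the 14-dimensional genome the summary mentions), and the fitness function is a hypothetical stand-in for a benchmark score. The names `merge`, `evolve`, and `toy_fitness` are all illustrative.

```python
import random

def merge(parent_a, parent_b, genome):
    """Per-layer linear interpolation (assumed scheme).

    genome[i] is the share of layer i taken from parent_b.
    """
    return [
        [(1.0 - r) * wa + r * wb for wa, wb in zip(layer_a, layer_b)]
        for layer_a, layer_b, r in zip(parent_a, parent_b, genome)
    ]

def evolve(parent_a, parent_b, fitness, generations=50, population=16,
           sigma=0.1, seed=0):
    """Greedy mutate-and-select search over merge ratios; no gradients used."""
    rng = random.Random(seed)
    best = [0.5] * len(parent_a)  # start from an even blend of both parents
    best_score = fitness(merge(parent_a, parent_b, best))
    for _ in range(generations):
        for _ in range(population):
            # Mutate every ratio with Gaussian noise, clamped to [0, 1].
            child = [min(1.0, max(0.0, r + rng.gauss(0.0, sigma))) for r in best]
            score = fitness(merge(parent_a, parent_b, child))
            if score > best_score:  # keep the child only if it improves
                best, best_score = child, score
    return best, best_score

# Toy parents: each is "good" at a different layer (hypothetical data).
parent_a = [[1.0, 1.0], [0.0, 0.0]]
parent_b = [[0.0, 0.0], [1.0, 1.0]]
target = [[1.0, 1.0], [1.0, 1.0]]

def toy_fitness(model):
    """Stand-in for a benchmark: negative squared distance to the target."""
    return -sum((w - t) ** 2
                for layer, t_layer in zip(model, target)
                for w, t in zip(layer, t_layer))

best_genome, best_score = evolve(parent_a, parent_b, toy_fitness)
print(best_genome, best_score)
```

In this toy setup the search should drift toward taking layer 0 from `parent_a` and layer 1 from `parent_b`, which mirrors the claim that merging can reorganize capabilities already present in the parents. A real system would evaluate merged checkpoints on held-out tasks, which is far more expensive than this stand-in.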
