🤖 AI Summary
Bagel's team today released Paris, billed as the world's first decentrally trained, open-weight diffusion model, published under an MIT license. Rather than training one monolithic model, Paris combines many smaller "expert" diffusion models, each trained from scratch in isolation on a different continent, with no gradient, parameter, or intermediate-activation synchronization during training. The project claims this zero-communication protocol matches the image quality of state-of-the-art distributed training while using roughly 14× less data and 16× less compute. The authors have published the model weights and a full technical report for both researchers and commercial users.
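The zero-communication claim is the core of the release. As a rough illustration of what "no synchronization" means in practice, the sketch below trains each expert on its own data shard with nothing shared between workers until the finished weights. Everything here (`ExpertDenoiser`, `train_expert`, the toy linear noising path, the dummy shards) is an illustrative stand-in, not code or architecture from the Paris release:

```python
import itertools

import torch
import torch.nn as nn


class ExpertDenoiser(nn.Module):
    """Toy stand-in denoiser; the real experts are full diffusion backbones."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on diffusion time by concatenating t to the input.
        return self.net(torch.cat([x_t, t[:, None]], dim=-1))


def train_expert(shard, steps: int = 200) -> ExpertDenoiser:
    """Train one expert from scratch on one shard, fully in isolation:
    no gradients, parameters, or activations ever leave this function."""
    model = ExpertDenoiser()
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for x0 in itertools.islice(itertools.cycle(shard), steps):
        t = torch.rand(x0.shape[0])                        # random timestep in [0, 1)
        noise = torch.randn_like(x0)
        x_t = (1 - t)[:, None] * x0 + t[:, None] * noise   # simple linear noising path
        loss = (model(x_t, t) - noise).pow(2).mean()       # standard noise-prediction loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


# Each region trains independently; only the finished weights are ever shared.
shards = [[torch.randn(32, 64) for _ in range(8)] for _ in range(4)]  # dummy data
experts = [train_expert(shard) for shard in shards]
```

In a real deployment each `train_expert` call would run on a different cluster on a different continent; the protocol's point is that those runs never exchange a single byte until training is done.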
For the AI/ML community this is notable because it demonstrates a practical route to large-model performance without conventional synchronous distributed training or heavy infrastructure, which could lower barriers to entry, enable privacy-preserving or geographically distributed workflows, and accelerate open research. The key technical implications are the viability of post-hoc aggregation of independently trained diffusion experts, the dramatic resource-efficiency claims, and a path toward further scaling, contingent on the new systems and algorithmic challenges the authors highlight. The release invites researchers and engineers to reproduce, extend, and help scale the approach toward global state-of-the-art results.
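The summary does not say how the experts are aggregated post hoc, so the following is only one plausible reading: a lightweight learned router that dispatches each input to a single independently trained expert at inference time. The `Router` class, the hard-argmax dispatch, and the deliberately simplified sampler are all assumptions for illustration (reusing `ExpertDenoiser` from the sketch above), not the published method:

```python
import torch
import torch.nn as nn


class Router(nn.Module):
    """Maps a conditioning vector (e.g. a prompt embedding) to expert scores."""

    def __init__(self, cond_dim: int, num_experts: int):
        super().__init__()
        self.proj = nn.Linear(cond_dim, num_experts)

    def forward(self, cond: torch.Tensor) -> torch.Tensor:
        return self.proj(cond).softmax(dim=-1)


@torch.no_grad()
def sample(experts, router, cond, dim: int = 64, steps: int = 50) -> torch.Tensor:
    """Route to one expert, then run a crude Euler-style denoising loop.
    The update rule is simplified for illustration, not a faithful solver."""
    expert = experts[int(router(cond).argmax())]   # hard routing: one expert per input
    x = torch.randn(1, dim)                        # start from pure noise
    for i in reversed(range(steps)):
        t = torch.full((1,), (i + 1) / steps)
        x = x - (1.0 / steps) * expert(x, t)       # step against the predicted noise
    return x


experts = [ExpertDenoiser() for _ in range(4)]     # stand-ins; use trained experts in practice
router = Router(cond_dim=16, num_experts=len(experts))
image_like = sample(experts, router, cond=torch.randn(16))
```

Whatever mechanism the technical report actually uses, the systems-level property is the same: experts stay frozen and independent, and only the combination step sees more than one of them.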