Stable Diffusion 3.5 Flash (hmrishavbandy.github.io)

🤖 AI Summary
Stable Diffusion 3.5 Flash (SD3.5-Flash) is a few-step distillation framework that brings high-quality rectified flow generation to consumer hardware. Rectified flow models usually need many refinement steps and heavy compute; SD3.5-Flash adapts distribution-matching distillation to the few-step regime while aiming to preserve the teacher model's quality. It introduces "timestep sharing": distribution-matching objectives are computed from intermediate samples along the student's own sampling trajectory instead of from re-noised data, which avoids the unstable gradients that typically degrade few-step training. A "split-timestep" fine-tuning strategy temporarily expands model capacity during training by adding branches specialized to different timesteps, easing the usual tradeoff between capacity and quality. Combined with system optimizations (text-encoder restructuring, quantization, and pipeline tweaks), SD3.5-Flash produces high-resolution images in under one second on devices with ~8 GB of memory and runs in real time on a phone (demonstrated on an iPhone with an A17 chip at 512px). Evaluations, including user studies with 124 annotators over 507 prompts and 4 seeds plus ELO-style human rankings, show a consistent user preference over prior few-step methods while maintaining teacher-model quality. The result is a practical path to deploying advanced generative flow models on GPUs and mobile devices with latency and memory footprints comparable to those of lighter models.
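To make "few-step" sampling and "intermediate trajectory samples" concrete, here is a minimal Python sketch of a 4-step Euler sampler for a rectified flow. It is an illustration under stated assumptions, not SD3.5-Flash's code: the names `toy_velocity` and `few_step_sample` are hypothetical, and a closed-form velocity field for a toy target (a single point `mu`) stands in for the learned network. The returned `trajectory` holds the kind of intermediate states that, per the summary, timestep sharing evaluates its objectives on.

```python
import random

def toy_velocity(x, t, mu):
    # Assumption: the target distribution is a point mass at mu, so the
    # rectified-flow path is x_t = (1 - t) * mu + t * noise and the exact
    # velocity is (x_t - mu) / t. A real model would be a neural network.
    return [(xi - mi) / t for xi, mi in zip(x, mu)]

def few_step_sample(x1, mu, num_steps=4):
    # Integrate from t = 1 (pure noise) down to t = 0 (data) with Euler steps.
    ts = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = list(x1)
    trajectory = [(ts[0], list(x))]  # intermediate states along the path
    for t, s in zip(ts[:-1], ts[1:]):
        v = toy_velocity(x, t, mu)
        x = [xi + (s - t) * vi for xi, vi in zip(x, v)]
        trajectory.append((s, list(x)))
    return x, trajectory

random.seed(0)
mu = [2.0, -1.0]
noise = [random.gauss(0.0, 1.0) for _ in mu]
sample, trajectory = few_step_sample(noise, mu, num_steps=4)
for t, state in trajectory:
    print(f"t={t:.2f}", state)
```

Because the toy velocity field is exact and the path is a straight line, four Euler steps land on `mu` up to floating-point error. A learned model has curved, imperfect trajectories, which is why reducing the step count normally costs quality and why distillation is needed.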