Show HN: Tiny Diffusion – A character-level text diffusion model from scratch (github.com)

🤖 AI Summary
Tiny Diffusion is a compact, character-level text diffusion model implemented from scratch (a GPT-style transformer adapted from nanochat) and trained on the Tiny Shakespeare corpus. The repo ships pre-trained weights (weights/diffusion_model.pt) so you can run sampling and visualizations locally with Python 3.10+. The model is intentionally small: 10.7M parameters, 6 transformer layers, 6 attention heads, 384-dim embeddings, and a 256-character context length, with 128 denoising steps at sampling time. The author reports training for ~20k steps in ~30 minutes on 4×A100 GPUs; sampling, checkpointing, and animation scripts (including a denoising visualization and a Game-of-Life-inspired sampler) are provided for easy experimentation.

The project is significant because it demonstrates diffusion-style generation applied to discrete text at a tiny scale, offering a hands-on platform to explore non-autoregressive sampling, denoising dynamics, and alternative sampling strategies without a large compute budget. It is useful for researchers and educators who want to inspect the stepwise denoising process, prototype controllable or hybrid models, or study sampling behavior in character- and token-level spaces.

The limitations are clear: the model is trained on a tiny dataset (Tiny Shakespeare), so it is not general-purpose, and diffusion-based text generation still involves tradeoffs in sampling steps and sequence length. Still, Tiny Diffusion is a practical, reproducible sandbox for experimenting with diffusion paradigms in NLP.
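To give a feel for what those 128 denoising steps do, here is a minimal sketch of masked-diffusion sampling for character-level text. This is not Tiny Diffusion's actual API: the `ToyDenoiser` stand-in, the confidence-based (MaskGIT-style) unmasking schedule, and all names below are assumptions chosen for illustration, and the repo's transformer and sampler may differ.

```python
# Illustrative masked-diffusion sampling loop for character-level text.
# All names and the unmasking schedule are assumptions, not the repo's code.
import torch
import torch.nn.functional as F

VOCAB_SIZE = 65        # Tiny Shakespeare has roughly 65 distinct characters
MASK_ID = VOCAB_SIZE   # extra "masked" token used as the absorbing state
SEQ_LEN = 256          # context length reported in the summary
NUM_STEPS = 128        # denoising steps reported in the summary


class ToyDenoiser(torch.nn.Module):
    """Randomly initialized stand-in for the trained 6-layer transformer."""

    def __init__(self):
        super().__init__()
        self.embed = torch.nn.Embedding(VOCAB_SIZE + 1, 384)
        self.head = torch.nn.Linear(384, VOCAB_SIZE)

    def forward(self, tokens):
        # (batch, seq_len) -> (batch, seq_len, vocab) logits
        return self.head(self.embed(tokens))


@torch.no_grad()
def sample(model, steps=NUM_STEPS, temperature=1.0):
    # Start from a fully masked sequence; each step, commit the model's
    # most confident predictions at still-masked positions.
    tokens = torch.full((1, SEQ_LEN), MASK_ID)
    for step in range(steps):
        logits = model(tokens) / temperature
        probs = F.softmax(logits, dim=-1)
        sampled = torch.multinomial(
            probs.view(-1, VOCAB_SIZE), 1
        ).view(1, SEQ_LEN)
        conf = probs.gather(-1, sampled.unsqueeze(-1)).squeeze(-1)
        # Only masked positions are candidates for unmasking this step.
        masked = tokens == MASK_ID
        conf = torch.where(masked, conf, torch.tensor(-1.0))
        # Linear schedule: the count of still-masked positions shrinks to 0.
        keep_masked = int(SEQ_LEN * (1 - (step + 1) / steps))
        n_unmask = int(masked.sum()) - keep_masked
        if n_unmask > 0:
            idx = conf.topk(n_unmask, dim=-1).indices
            tokens[0, idx[0]] = sampled[0, idx[0]]
    return tokens


if __name__ == "__main__":
    out = sample(ToyDenoiser())  # random weights, so the text is gibberish
    print(out.shape)             # torch.Size([1, 256])
```

With the repo's real checkpoint, the random stand-in would be replaced by the trained 10.7M-parameter model loaded from weights/diffusion_model.pt (assuming a standard PyTorch state dict), so the committed characters converge toward Shakespeare-like text rather than noise.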