🤖 AI Summary
Researchers introduce Generative Adversarial Distillation (GAD), a black-box, on-policy method for creating student LLMs from proprietary teachers using only the teachers’ text outputs (no logits or parameters). GAD treats the student as a generator and trains a discriminator to tell student responses apart from teacher responses, creating a minimax game where the discriminator serves as a continuously updated, on-policy reward model. Because the reward model co-evolves with the student, GAD supplies adaptive, stable feedback that mitigates distributional mismatch problems common in offline sequence-level knowledge distillation.
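To make the minimax structure concrete, here is a minimal sketch of a GAD-style training loop, not the authors' implementation. Everything here is a stand-in assumption: toy GRU networks play the roles of the student generator and the discriminator, random token ids stand in for the teacher's text-only responses, and a simple REINFORCE update with the discriminator's score as the on-policy reward replaces whatever policy-gradient method the paper actually uses.

```python
# Hedged sketch of the GAD minimax loop (not the authors' code).
# Assumptions: toy token-level Student/Discriminator, random ids as
# placeholder "teacher responses", REINFORCE as the student update.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN, HIDDEN, BATCH = 100, 16, 64, 8

class Student(nn.Module):                      # generator: toy autoregressive policy
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def sample(self, batch):
        """Sample responses on-policy; return token ids and their log-probs."""
        tokens = torch.zeros(batch, 1, dtype=torch.long)   # BOS = 0
        log_probs, h = [], None
        for _ in range(SEQ_LEN):
            out, h = self.rnn(self.embed(tokens[:, -1:]), h)
            dist = torch.distributions.Categorical(logits=self.head(out[:, -1]))
            tok = dist.sample()
            log_probs.append(dist.log_prob(tok))
            tokens = torch.cat([tokens, tok.unsqueeze(1)], dim=1)
        return tokens[:, 1:], torch.stack(log_probs, dim=1)

class Discriminator(nn.Module):                # co-evolving, on-policy reward model
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.score = nn.Linear(HIDDEN, 1)

    def forward(self, tokens):
        _, h = self.rnn(self.embed(tokens))
        return self.score(h[-1]).squeeze(-1)   # higher = "looks like the teacher"

student, disc = Student(), Discriminator()
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for step in range(200):
    # Teacher responses are text-only outputs; random ids stand in here.
    teacher_resp = torch.randint(1, VOCAB, (BATCH, SEQ_LEN))

    # Discriminator step: separate teacher responses from student samples.
    with torch.no_grad():
        student_resp, _ = student.sample(BATCH)
    d_loss = (
        F.binary_cross_entropy_with_logits(disc(teacher_resp), torch.ones(BATCH))
        + F.binary_cross_entropy_with_logits(disc(student_resp), torch.zeros(BATCH))
    )
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Student step: REINFORCE, with the discriminator score as the reward.
    student_resp, log_probs = student.sample(BATCH)
    with torch.no_grad():
        reward = torch.sigmoid(disc(student_resp))
        reward = reward - reward.mean()        # simple baseline for variance reduction
    g_loss = -(log_probs.sum(dim=1) * reward).mean()
    opt_s.zero_grad(); g_loss.backward(); opt_s.step()
```

The key property the sketch illustrates is that the reward signal is recomputed every step from a discriminator trained on the student's own current samples, so the feedback stays on-policy and adapts as the student improves, rather than being fixed offline.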
Empirically, GAD consistently outperforms standard sequence-level distillation and can close large capability gaps: a Qwen2.5-14B-Instruct student trained with GAD performed comparably to its teacher, GPT-5-Chat, on the LMSYS-Chat automatic evaluation. The approach is significant for the AI/ML community because it enables effective model compression and capability transfer from closed-source teacher models without internal access, offering a practical pathway to replicate behavior from proprietary LLMs. Technical implications include a shift toward adversarially trained reward models for on-policy distillation, improved robustness to covariate shift, and renewed discussion of model-extraction risks and IP considerations when only text outputs are exposed.