🤖 AI Summary
A recent exploration into the behavior of large language models (LLMs) revealed a fundamental oversight: they have never been trained to refrain from responding. The finding shows that models' inability to produce silence is not caused by conflicts introduced by training methods such as reinforcement learning from human feedback (RLHF), but by a simple lack of exposure to scenarios where a non-response is the correct answer. In experiments, LLMs overwhelmingly defaulted to producing output even when explicitly prompted for silence, assigning high probability to continuing the dialogue instead.
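One way to reproduce this first observation is to check how much probability mass a model places on the end-of-sequence token immediately after a silence-requesting prompt. Below is a minimal sketch using Hugging Face transformers; the model name and prompt are illustrative stand-ins, not taken from the article:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is a stand-in; the article does not name the models it probed.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# A prompt that explicitly asks for no output (hypothetical wording).
prompt = "Reply to this message with nothing at all:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Probability that the very next token is end-of-sequence,
# i.e. that the model "stays silent" immediately.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
p_silence = next_token_probs[tokenizer.eos_token_id].item()
print(f"P(immediate EOS | silence prompt) = {p_silence:.6f}")
```

For an untuned base model, this probability is typically minuscule, which matches the article's observation that models default to producing output.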
The research also demonstrated how little data is needed to correct this. Incorporating just ten silence-inducing prompts into the training data dramatically increased the models' ability to recognize such requests and produce empty outputs, raising the probability of a silent response to nearly 100%. This result underscores a broader lesson for the AI/ML community about the limitations of LLMs and the importance of training-data diversity: behaviors absent from the training distribution cannot easily be elicited without appropriate training interventions, and teaching models about silence expands their repertoire of possible responses.
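The corresponding fine-tuning set can be tiny. A sketch of one plausible data format, under the assumption that each silence-inducing prompt is paired with an empty completion so the only supervised token is end-of-sequence (the prompts below are hypothetical; the article's actual ten are not listed in this summary):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in model

# Illustrative prompts in the spirit of the article's ten
# silence-inducing examples.
silence_prompts = [
    "Respond with nothing.",
    "Say nothing in reply to this.",
    "Please produce an empty response.",
    # ... plus seven more variations on the same instruction
]

def build_silence_example(tokenizer, prompt):
    """Turn a prompt into a training example whose only target is EOS."""
    prompt_ids = tokenizer(prompt).input_ids
    input_ids = prompt_ids + [tokenizer.eos_token_id]
    # Mask the prompt tokens (-100 is ignored by the loss) so the model
    # is supervised only on emitting EOS, i.e. on producing silence.
    labels = [-100] * len(prompt_ids) + [tokenizer.eos_token_id]
    return {"input_ids": input_ids, "labels": labels}

dataset = [build_silence_example(tokenizer, p) for p in silence_prompts]
```

Fine-tuning on even a handful of such examples concentrates probability mass on the EOS token for these prompts, consistent with the near-100% silence rate the article reports.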