🤖 AI Summary
Researchers show that large language models can perform text steganography: a meaningful message can be encoded inside a different, equally long piece of fluent text so that both read as plausible outputs. The paper presents a simple, efficient protocol that works even with modest open-source LLMs (around 8 billion parameters) and can encode and decode messages as long as the paper's own abstract in seconds on a laptop. The authors provide code, demos, and data, demonstrating that the technique is practical, fast, and does not require massive models or cloud-scale resources.
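The paper's own protocol is not reproduced here, but the general family of techniques it relates to can be illustrated with a toy sketch: a sender and receiver who share the same model can hide bits in *which* plausible next token is chosen at each step. The code below is a minimal illustration only, assuming a small Hugging Face causal LM (`gpt2` is a placeholder, not the model used in the paper) and a crude 2-way rank code (one bit per token) rather than the authors' scheme.

```python
# Illustrative sketch, NOT the paper's protocol: rank-based LLM steganography.
# Each secret bit selects between the two most likely next tokens, so the
# cover text stays fluent while carrying one hidden bit per generated token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumption for illustration; any shared causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()


def encode(bits, prompt="The weather today"):
    """Hide `bits` by picking the 1st or 2nd most likely token at each step.

    Returns the cover text and the raw token ids (a real protocol would
    recover the ids by re-tokenizing the text deterministically).
    """
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for bit in bits:
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        top2 = torch.topk(logits, 2).indices        # two most plausible continuations
        ids = torch.cat([ids, top2[bit].view(1, 1)], dim=1)
    return tokenizer.decode(ids[0]), ids


def decode(ids, prompt="The weather today"):
    """Recover the bits by replaying the model and ranking each chosen token."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    bits = []
    for pos in range(prompt_len, ids.shape[1]):
        with torch.no_grad():
            logits = model(ids[:, :pos]).logits[0, -1]
        top2 = torch.topk(logits, 2).indices
        bits.append(int((top2 == ids[0, pos]).nonzero().item()))
    return bits


secret = [1, 0, 1, 1, 0, 0, 1, 0]                  # e.g. one ASCII character
cover_text, token_ids = encode(secret)
assert decode(token_ids) == secret
print(cover_text)
```

Practical schemes are considerably more sophisticated: they typically encode bits with arithmetic or entropy coding over the model's full next-token distribution, which yields higher capacity and cover text that is statistically closer to ordinary sampling than this fixed top-2 choice.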
This capability has wide technical and safety implications: it decouples surface text from authorial intent, making written outputs unreliable signals of what an author (or model) actually intended. It enables covert channels — for example, hiding unfiltered answers inside the “safe” responses of a moderated model — and complicates content moderation, provenance, watermarking, and forensic detection. More philosophically, it challenges notions of what an LLM “knows,” since latent information can be concealed without overt expression. The work raises urgent questions for AI safety, detection/defense strategies, and governance of deployed models.