🤖 AI Summary
Researchers introduced Truth-Aware Decoding (TAD), a verification-oriented decoding framework that enforces factuality by weaving a lattice of semantic “guards” into the generation process. Framed in probabilistic program semantics, TAD treats oracle-style filtering as a program-logic judgment and provides formal machinery—including a multi-agent operational calculus and machine-checked Lean artefacts—to certify that decoder behavior respects those guards. The paper proves that, when guards are sound and complete, greedy selection preserves a local likelihood-dominance property (Theorem 2.7), and it derives an entropy-style invariant that measures factual risk via a knowledge-aware “safe mass.”
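To make the mechanism concrete, here is a minimal sketch of one guard-constrained greedy decode step. The function name, the dict-based token distribution, and the guard predicate are illustrative assumptions, not the paper's API; the sketch only shows the shape of the idea: mask tokens the guard rejects, take the argmax over the remainder, and report the probability mass left on admissible tokens as the "safe mass" risk signal.

```python
import math

def guarded_greedy_step(logprobs, guard):
    """One guard-constrained greedy decode step (illustrative sketch).

    logprobs: dict mapping token -> log-probability from the base LM.
    guard:    predicate token -> bool; True means the token is
              admissible with respect to the knowledge base.
    Returns (chosen_token, safe_mass), where safe_mass is the total
    probability the LM assigns to admissible tokens -- a proxy for
    residual factual risk (low safe mass = high risk).
    """
    allowed = {t: lp for t, lp in logprobs.items() if guard(t)}
    if not allowed:
        raise ValueError("guard rejected every candidate token")
    safe_mass = sum(math.exp(lp) for lp in allowed.values())
    # Greedy selection restricted to the admissible set: masking only
    # removes tokens, so the argmax over `allowed` still dominates the
    # likelihood of every other admissible token.
    token = max(allowed, key=allowed.get)
    return token, safe_mass

# Toy example: the guard vetoes a factually wrong completion.
logits = {"Paris": math.log(0.6),
          "Paris-Texas": math.log(0.3),
          "Lyon": math.log(0.1)}
tok, mass = guarded_greedy_step(logits, lambda t: t != "Paris-Texas")
# tok == "Paris"; mass is 0.7, the probability left on admissible tokens
```

Because the guard only removes candidates and never reweights the survivors, the restricted argmax agrees with the base model's ranking on the admissible set, which is the intuition behind the local dominance result the paper formalizes.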
This is significant because it gives the AI/ML community a principled, provable bridge between large empirical LMs and formal verification: decode-time constraints can be expressed, reasoned about, and mechanically verified without re-training. Practically, TAD integrates with instruction-tuned models, lets knowledge bases constrain token choices, quantifies residual factual uncertainty, and preserves throughput. Numerical and algorithmic case studies reported fewer hallucinations with negligible performance cost, showing the approach is both pragmatic and theoretically grounded. Implications include safer retrieval-augmented generation, clearer guarantees about greedy vs. other decoding strategies under constraints, and a path toward certified, knowledge-aware generation in deployed systems.