🤖 AI Summary
Researchers evaluated several state-of-the-art "reasoning" LLMs, or RLLMs (o1-mini, o3-mini, DeepSeek‑R1, Claude 3.7 Sonnet, Gemini 2.5 Pro Preview, Grok 3 Mini Beta), on graph coloring, a constraint‑satisfaction logic task whose complexity can be varied systematically, and found a recurring, systematic failure mode: the models hallucinate critical problem features, most notably graph edges that were not present in the prompt. By comparing error rates across complexity levels and analyzing chain‑of‑thought (CoT) explanations, the authors show that these input‑conflicting hallucinations account for a large fraction of incorrect answers, and for some models the majority of errors. Smaller follow‑up experiments on stable‑matching instances suggest the failure is not task‑specific but reflects a broader misrepresentation of problem specifics, despite CoT training via reinforcement learning.
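To make the failure mode concrete: an input‑conflicting hallucination here means the model's reasoning asserts an edge the prompt never listed. The minimal sketch below (the paper's actual analysis pipeline is not described in this summary; the function names, regex, and example trace are illustrative assumptions) flags edges a model reasons about that are absent from the stated graph:

```python
import re

def extract_edge_claims(cot_text: str) -> set[tuple[int, int]]:
    """Pull (u, v) pairs the model asserts as edges from its chain of thought.

    The regex is a stand-in; real CoT traces would need a parser tuned to
    each model's phrasing.
    """
    pairs = re.findall(
        r"(\d+)\s*(?:-|and|,)\s*(\d+)\s*(?:are|is)?\s*(?:connected|adjacent|an edge)",
        cot_text,
        flags=re.IGNORECASE,
    )
    return {tuple(sorted((int(u), int(v)))) for u, v in pairs}

def hallucinated_edges(cot_text: str,
                       true_edges: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Edges the model reasons about that do not exist in the prompt's graph."""
    claimed = extract_edge_claims(cot_text)
    return claimed - {tuple(sorted(e)) for e in true_edges}

# Example: the prompt listed edges {(0, 1), (1, 2)}, but the CoT asserts 0-2 exists.
trace = "Vertices 0 and 2 are connected, so they need different colors."
print(hallucinated_edges(trace, {(0, 1), (1, 2)}))  # -> {(0, 2)}
```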
This finding is significant because it undermines a core assumption about RLLMs: that improved CoT and RL training reliably anchor models to explicit problem constraints. Practically, it means LLMs can invent or omit constraints during reasoning, breaking correctness guarantees in formal tasks and pipelines that expect faithful problem representation. Technical implications include the need for tighter input grounding, verification layers, constrained decoding, symbolic integration (e.g., calling solvers), and benchmark designs that detect input‑conflicting hallucinations. The paper offers design suggestions to mitigate this weakness, underscoring that better chain‑of‑thought alone may not solve misrepresentation errors.
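One of the mitigations listed above, a verification layer, can be as simple as checking the model's final answer against the constraints actually stated in the prompt rather than the constraints the model reasoned about. A minimal sketch, assuming edges are given as integer pairs and the answer as a vertex-to-color map (names and data shapes are our own, not the paper's):

```python
def verify_coloring(edges: set[tuple[int, int]],
                    coloring: dict[int, str]) -> list[tuple[int, int]]:
    """Return every edge whose endpoints share a color (empty list = valid coloring).

    A check like this grounds the pipeline in the stated constraints: even if the
    model's reasoning invented or dropped edges, the final answer is judged only
    against the edges given in the prompt.
    """
    return [(u, v) for u, v in edges
            if coloring.get(u) is not None and coloring.get(u) == coloring.get(v)]

# Example: a 3-cycle cannot be 2-colored; this answer violates edge (0, 2).
edges = {(0, 1), (1, 2), (0, 2)}
answer = {0: "red", 1: "blue", 2: "red"}
print(verify_coloring(edges, answer))  # -> [(0, 2)]
```

Because the check is purely symbolic, it catches input‑conflicting hallucinations regardless of how plausible the chain of thought sounds.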