LLM probabilities cannot distinguish between possible and impossible language (arxiv.org)

🤖 AI Summary
Researchers tested whether Large Language Model (LLM) output probabilities can reliably mark the boundary between possible (grammatical) and impossible (ungrammatical) language. Using a novel benchmark, they elicited token-string probabilities from four LLMs and computed minimal-pair surprisal differences comparing grammatical sentences to (i) lower-frequency grammatical variants, (ii) ungrammatical sentences, (iii) semantically odd sentences, and (iv) pragmatically odd sentences. If probabilities reflected syntactic competence, the ungrammatical items should show a uniquely large spike in surprisal. Instead, ungrammatical sentences did not exhibit a distinct surprisal signature; the semantically and pragmatically odd conditions often produced equal or higher surprisal than outright syntactic violations. The work is significant because many prior claims about LLM linguistic knowledge rely on model probabilities or perplexity as proxies for grammaticality judgments. These results show that raw string probabilities conflate syntactic, semantic, pragmatic, and frequency effects and are therefore not a reliable window into model-internal syntactic representations. Practically, this cautions researchers and practitioners against using probability-based metrics alone to infer grammatical competence, and argues for complementary methods (e.g., targeted probing of internal representations, controlled behavioral contrasts, or syntactic probes) when evaluating linguistic knowledge in LLMs.
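
To make the surprisal methodology concrete, here is a minimal sketch of how per-sentence surprisal and a minimal-pair difference might be computed with a Hugging Face causal language model. This is not the paper's code or benchmark: the model name ("gpt2") and the example sentence pair are placeholder assumptions, and the paper evaluates four LLMs on its own stimuli.

```python
# Sketch: total surprisal -log p(sentence) from a causal LM, and the
# minimal-pair difference between an ungrammatical and a grammatical sentence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, not one the paper necessarily used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def total_surprisal(sentence: str) -> float:
    """Return -log p(sentence) in nats, summed over all predicted tokens."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=ids the model returns the mean cross-entropy over the
        # predicted tokens (the first token has no prediction target).
        mean_nll = model(ids, labels=ids).loss
    n_predicted = ids.size(1) - 1
    return mean_nll.item() * n_predicted

# Hypothetical minimal pair (illustrative only, not from the benchmark).
grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."

# The paper's logic: if string probabilities tracked grammaticality, the
# ungrammatical member should show a distinctly larger surprisal than merely
# low-frequency, semantically odd, or pragmatically odd variants.
delta = total_surprisal(ungrammatical) - total_surprisal(grammatical)
print(f"Surprisal difference (ungrammatical - grammatical): {delta:.2f} nats")
```

Under this setup, the finding reported above amounts to the observation that such deltas for ungrammatical items are not reliably larger than the deltas for semantically or pragmatically odd items, so the raw probability signal does not isolate syntax.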