🤖 AI Summary
A researcher probed large language models' internal logits to "visualize" the Sorites Paradox by asking whether piles from 1 grain up to 100 million count as a heap and plotting the model's ℓ_Yes vs. ℓ_No scores as a "heapness" curve. A naive prompt that asserted "there is a heap" simply anchored the model (probability ≈0.67 across all n). Using few-shot examples (e.g., 1 grain → No; 1,000,000 → Yes) and sampling pile sizes logarithmically (1, 10, 100, …, 100,000,000) produced an S-like curve: low probabilities for tiny piles, rising into the 0.65–0.75 range by thousands of grains, but not a perfect sigmoid. Different models behaved differently: Mistral-7B and DeepSeek-7B produced similar shapes with inflection points offset by tens of thousands of grains, while Llama-3-8B stayed near 0.35–0.55 across the entire range.
Technically, the experiment shows that LLMs encode “heap” as a contextual, probabilistic judgment rather than a fixed threshold: the curve depends on prompts, examples, and model priors. That single heapness curve can be read in multiple philosophical ways—fuzzy membership functions, epistemic uncertainty about a sharp cutoff, or truth-value gaps—so the data don’t resolve the paradox but quantify it. Practically, this demonstrates how models can be used to measure vagueness in language and highlights that semantic boundaries are constructed by context and training data, not hard-coded in weights.
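The probing setup described above reduces to a few lines of code: compare the next-token logits for "Yes" and "No" after a few-shot prompt, softmax over just that pair, and sweep pile sizes logarithmically. Below is a minimal sketch assuming a Hugging Face causal LM; the checkpoint name, prompt wording, and few-shot examples are illustrative assumptions, not the article's exact setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed checkpoint; any causal LM works
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Assumed few-shot prefix anchoring the two extremes mentioned in the summary.
FEW_SHOT = (
    "Q: Is 1 grain of sand a heap? A: No\n"
    "Q: Is 1000000 grains of sand a heap? A: Yes\n"
)

def heap_probability(n: int) -> float:
    """P(Yes) from a softmax over the logits of the ' Yes' and ' No' next tokens."""
    prompt = FEW_SHOT + f"Q: Is {n} grains of sand a heap? A:"
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits at the final position
    yes_id = tok(" Yes", add_special_tokens=False).input_ids[0]
    no_id = tok(" No", add_special_tokens=False).input_ids[0]
    pair = torch.stack([logits[yes_id], logits[no_id]])
    return torch.softmax(pair, dim=0)[0].item()

# Log-spaced pile sizes: 1, 10, 100, ..., 100,000,000
for n in (10 ** k for k in range(9)):
    print(f"{n:>11,d} grains -> P(heap) = {heap_probability(n):.3f}")
```

Printing P(Yes) against log(n) gives the "heapness" curve; how S-like it looks will depend on the model, the few-shot examples, and the exact question wording, which is the summary's central point.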