Humans and LLMs represent sentences similarly, study finds (techxplore.com)

🤖 AI Summary
Researchers at Zhejiang University report in Nature Human Behaviour that humans and large language models (LLMs) represent sentences in remarkably similar, tree-like ways. Using a one-shot word-deletion task, the team tested 372 participants (native Chinese, native English, and bilingual speakers) alongside ChatGPT: after a single demonstration of which words to remove, subjects had to infer the rule and delete a span from a new test sentence. Both humans and the LLM preferentially removed full syntactic constituents (coherent grammatical units) rather than arbitrary word strings, and the deleted spans followed language-specific rules for Chinese versus English. Crucially, the pattern of deletions allowed the underlying constituency trees to be reconstructed, a result the authors argue cannot be explained by models that rely only on word-level properties or linear position.

This implies that the LLM's internal representations reflect latent, tree-structured syntax similar to human mental representations. The finding strengthens the case for using LLMs as tools and models in psycholinguistics, informs interpretability work (by showing emergent syntactic structure), and points to future studies probing when and how such representations are used across tasks, languages, and model architectures.
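To make "full syntactic constituent" concrete: a deleted span counts as a constituent only if it exactly covers the leaves of some subtree in the sentence's parse. Below is a minimal Python sketch of that check, not the authors' code; the example sentence, its hand-written bracketed parse, and the function names are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the authors' code): check whether a deleted
# token span exactly corresponds to a subtree of a constituency parse.
from nltk import Tree


def _collect_spans(tree, start, spans):
    """Recursively record the (start, end) token span covered by every subtree."""
    pos = start
    for child in tree:
        if isinstance(child, Tree):
            pos = _collect_spans(child, pos, spans)
        else:
            pos += 1  # a leaf token
    spans.add((start, pos))
    return pos


def is_constituent(parse, start, end):
    """True if tokens [start, end) exactly cover the leaves of some subtree."""
    spans = set()
    _collect_spans(parse, 0, spans)
    return (start, end) in spans


if __name__ == "__main__":
    # Hand-written toy parse of "the cat sat on the mat" (assumed, for illustration).
    parse = Tree.fromstring(
        "(S (NP (DT the) (NN cat)) (VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))"
    )
    print(is_constituent(parse, 3, 6))  # "on the mat" -> True  (a full PP)
    print(is_constituent(parse, 2, 4))  # "sat on"     -> False (crosses constituents)
```

Under this kind of check, deleting "on the mat" removes a complete prepositional phrase, while deleting "sat on" cuts across constituent boundaries; the study's claim is that both humans and ChatGPT strongly prefer the first kind of deletion.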