Task-free intelligence testing of LLMs (www.marble.onl)

🤖 AI Summary
Recent experiments introduce a novel approach to evaluating large language models (LLMs): "task-free" intelligence testing. Instead of conventional benchmarks that score models on their ability to complete specific tasks, these tests present non-task stimuli, specifically the word "tap" repeated according to numerical patterns (e.g., Fibonacci numbers, primes, or even numbers). The focus is on how models engage with the stimuli, since their spontaneous, exploratory behavior may better reflect underlying cognitive traits than task performance alone.

The findings suggest significant behavioral differences across LLMs under this unconventional evaluation. Several models, including Claude and Gemini, engaged playfully and generated creative responses, while OpenAI’s GPT 5.2 remained mechanical and less interactive. This variance offers insight into how different architectures handle abstract stimuli. By emphasizing exploration over performance-based metrics, the method provides a fresh lens for assessing LLM intelligence, moving the discussion beyond functional capability toward the models' underlying cognitive processes.
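The summary does not specify exactly how the stimuli were formatted, but a minimal sketch of how such "tap" sequences might be generated is shown below; the pattern generators and the one-line-per-count layout are illustrative assumptions, not details from the source.

```python
# Sketch: build "tap" stimuli following numerical patterns.
# The grouping (one line of "tap"s per term) is an assumption.

def fibonacci(n):
    """First n Fibonacci numbers: 1, 1, 2, 3, 5, ..."""
    seq, a, b = [], 1, 1
    for _ in range(n):
        seq.append(a)
        a, b = b, a + b
    return seq

def primes(n):
    """First n primes via trial division against primes found so far."""
    seq, candidate = [], 2
    while len(seq) < n:
        if all(candidate % p for p in seq):
            seq.append(candidate)
        candidate += 1
    return seq

def evens(n):
    """First n even numbers: 2, 4, 6, ..."""
    return [2 * i for i in range(1, n + 1)]

def tap_stimulus(counts):
    """Render one line of 'tap's per count, e.g. [1, 2] -> 'tap\ntap tap'."""
    return "\n".join(" ".join(["tap"] * c) for c in counts)

if __name__ == "__main__":
    for name, counts in [("fibonacci", fibonacci(6)),
                         ("primes", primes(6)),
                         ("evens", evens(6))]:
        print(f"--- {name} ---")
        print(tap_stimulus(counts))
```

The resulting text blocks would then be sent to each model as-is, with no instructions attached, so that any response reflects the model's own inclination to probe or continue the pattern rather than compliance with a task.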