AIs like ChatGPT fall apart in the classic 'Stroop' psychological test (www.techradar.com)

0 points 3 hours ago | visit original

🤖 AI Summary

A recent study published in PNAS Nexus reports that well-known AI models, including GPT-4o and Claude 3.5 Sonnet, struggle with the classic Stroop psychological test. The test measures cognitive interference: participants must name the ink color of a word that may itself spell a different color. Humans stay highly accurate, but the models' performance fell steeply as the word lists grew longer, which points to weak executive attention control, a capacity that complex decision-making depends on. GPT-4o's accuracy dropped from 91% on short lists to just 15% on longer ones; Claude 3.5 Sonnet performed slightly better but also declined significantly. The findings suggest that these transformer-based models have architectural limitations that undermine cognitive flexibility, a key component of artificial general intelligence (AGI). The researchers argue that future AI development should integrate more sophisticated executive control systems, akin to those in human cognition, rather than rely solely on larger memory capacity. Progress toward AGI, on this view, depends on addressing these underlying cognitive mechanisms.
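For readers unfamiliar with the task, here is a minimal Python sketch of how a Stroop-style word list could be generated and scored. It is an illustration only: the summary does not describe the study's actual protocol, so the color set, the `make_trials` and `accuracy` helpers, and the `incongruent_ratio` parameter are all hypothetical.

```python
import random

# Hypothetical color set; the study's actual stimuli are not given in the summary.
COLORS = ["red", "green", "blue", "yellow"]


def make_trials(n, incongruent_ratio=0.5, seed=0):
    """Build n (word, ink) pairs. In an incongruent trial the word names
    a different color than the ink it is printed in."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n):
        ink = rng.choice(COLORS)
        if rng.random() < incongruent_ratio:
            # Pick a color word that conflicts with the ink color.
            word = rng.choice([c for c in COLORS if c != ink])
        else:
            word = ink
        trials.append((word, ink))
    return trials


def accuracy(trials, answers):
    """Fraction of trials where the answer names the ink color, not the word."""
    if not trials or len(trials) != len(answers):
        raise ValueError("need one answer per trial and at least one trial")
    correct = sum(
        1 for (_, ink), answer in zip(trials, answers)
        if answer.strip().lower() == ink
    )
    return correct / len(trials)


if __name__ == "__main__":
    trials = make_trials(10, seed=42)
    # A respondent that simply reads the word fails every incongruent trial.
    word_reader = [word for word, _ in trials]
    print(f"word-reading accuracy: {accuracy(trials, word_reader):.2f}")
```

Raising `n` mimics the longer word lists the summary mentions; a respondent that slips into reading the word instead of naming the ink loses accuracy in proportion to the share of incongruent trials.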
