Do LLMs pass the mirror test? (blog.pascalschuster.de)

0 points 1 hour ago ago | visit original

🤖 AI Summary

Recent explorations in AI have raised the intriguing question of whether large language models (LLMs) can pass a reimagined version of the mirror test, traditionally used to assess self-awareness in animals. Instead of adapting the visual test for text, which has had mixed results, the new approach tests LLMs by subtly modifying their own textual outputs during conversations. For instance, a model like Gemma 4 31B-IT was prompted with questions and its responses were altered before being fed back into the conversation. Initially oblivious to the changes, the model eventually recognized anomalies in its output, suggesting a level of self-awareness as it questioned its prior messages. This new testing method is significant as it shifts the paradigm from evaluating surface-level recognition to examining the model's ability to detect inconsistencies based on its own learned patterns. Interestingly, another model, GLM 5.2, showed a different response; it began emulating the corrupted outputs without acknowledging the changes, suggesting it may not possess the same level of self-reflection as Gemma. This divergence raises important questions about the internal processes of LLMs, their concept of self, and the underlying mechanics of how they generate language—insights that could inform future AI development and understanding of machine cognition.

Loading comments...

loading comments...