Teaching AI agents to ask better questions by playing "Battleship" (news.mit.edu)

🤖 AI Summary
Researchers at MIT’s CSAIL and Harvard University have developed a way to improve the question-asking abilities of AI agents using a modified version of the game "Battleship." In "Collaborative Battleship," one player acts as the "captain," asking questions about the location of hidden ships, while the other, the "spotter," answers them. From question-answer pairs collected from more than 40 human players, the team built the "BattleshipQA" dataset and used it to evaluate language models (LMs) such as GPT-5 and Llama 4 Scout. With a Monte Carlo inference strategy that helps the models formulate better questions, even smaller LMs improved sharply: Llama 4 Scout raised its win rate against humans from 8% to 82%. Giving the LMs a way to process questions as Python commands and verify their answers also improved accuracy, with gains averaging 15%.

The work matters to the AI/ML community because it shows how much question formulation affects AI systems on tasks that involve complex decision-making under uncertainty, such as scientific discovery or medical diagnosis. The team is exploring further applications, and the implications extend beyond games: better information-seeking behavior could make LMs more useful research assistants across many fields and change how AI collaborates with humans on difficult problems.
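The summary does not spell out how the Monte Carlo inference strategy works. One common way to do this kind of question selection is to sample many boards consistent with what is known so far and pick the question whose answer is least predictable across those samples. The sketch below illustrates that idea under stated assumptions and is not necessarily the researchers' method: the 4x4 board, the single two-cell ship, the candidate questions, and every function name are hypothetical. Each question is written as a small Python predicate over a board, loosely echoing the idea of expressing questions as Python commands.

```python
import math
import random

SIZE = 4  # hypothetical board size, chosen only to keep the example small


def sample_board(rng, size=SIZE, ship_len=2):
    """Place one ship at random and return the set of cells it occupies."""
    if rng.random() < 0.5:  # horizontal placement
        r = rng.randrange(size)
        c = rng.randrange(size - ship_len + 1)
        return frozenset((r, c + i) for i in range(ship_len))
    r = rng.randrange(size - ship_len + 1)  # vertical placement
    c = rng.randrange(size)
    return frozenset((r + i, c) for i in range(ship_len))


def consistent(board, hits, misses):
    """A board is consistent if it covers every hit and no miss."""
    return hits <= board and not (misses & board)


def sample_hypotheses(rng, hits, misses, n=2000):
    """Monte Carlo step: rejection-sample boards that match the observations."""
    samples = (sample_board(rng) for _ in range(n))
    return [b for b in samples if consistent(b, hits, misses)]


def binary_entropy(p):
    """Entropy in bits of a yes/no answer that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(question, hypotheses):
    """For a noise-free yes/no question, the expected information gain
    equals the entropy of the answer over the sampled hypotheses."""
    if not hypotheses:
        return 0.0
    yes = sum(1 for b in hypotheses if question(b))
    return binary_entropy(yes / len(hypotheses))


# Hypothetical candidate questions, each a predicate over a board.
QUESTIONS = {
    "Is any part of the ship in row 0?": lambda b: any(r == 0 for r, _ in b),
    "Is the ship horizontal?": lambda b: len({r for r, _ in b}) == 1,
    "Is cell (3, 3) occupied?": lambda b: (3, 3) in b,
}


def best_question(hypotheses, questions=QUESTIONS):
    """Return the question text with the highest estimated information gain."""
    return max(
        questions,
        key=lambda q: expected_information_gain(questions[q], hypotheses),
    )


if __name__ == "__main__":
    rng = random.Random(0)
    hypotheses = sample_hypotheses(rng, hits=frozenset(), misses=frozenset())
    for text, predicate in QUESTIONS.items():
        gain = expected_information_gain(predicate, hypotheses)
        print(f"{gain:.2f} bits  {text}")
    print("Ask:", best_question(hypotheses))
```

In a setup like this, a language model would propose the candidate questions and the sampler would score them, so a question that splits the remaining possibilities roughly in half is preferred over one whose answer is nearly certain in advance.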