🤖 AI Summary
An international study led by the BBC and the European Broadcasting Union evaluated how four widely used AI assistants (ChatGPT, Microsoft Copilot, Google Gemini and Perplexity) answer news questions. Twenty-two public service media organisations across 18 countries and 14 languages posed 30 core questions (plus additional local ones) and had journalists rate the free, consumer-version responses, generated in May–June 2025, on five criteria: accuracy (including direct quotes), sourcing, distinguishing opinion from fact, editorialisation and context. There has been some improvement since the BBC's earlier round: in the BBC's repeat of its own evaluation, the share of responses with significant issues fell from 51% to 37%. Even so, the multi-market audit found errors remain widespread, with 45% of responses containing at least one significant issue. Sourcing was the largest problem, accounting for 31% of significant issues; Gemini was especially poor on attribution (72% of its responses had major sourcing errors), while Copilot, ChatGPT and Perplexity each had roughly a third of responses with significant faults.
The study shows these problems are systemic across languages and markets, spanning misattribution, overconfidence, editorialisation and insufficient context, all of which can mislead users and erode publisher trust as AI "answer-first" experiences divert traffic (publishers report search-referral drops of 25–30%). The report releases a "News Integrity in AI Assistants Toolkit" and urges developers to prioritise accuracy, report performance transparently by language and market, and give publishers control and clear citation formats; it also urges policymakers to consider accountability measures and calls for continued media-literacy work so audiences understand assistants' limits.