Wikipedia urges AI companies to use its paid API, and stop scraping (techcrunch.com)

0 points 254 days ago | visit original

🤖 AI Summary

The Wikimedia Foundation is urging AI developers to stop scraping Wikipedia and to access its content through Wikimedia Enterprise instead. Enterprise is an opt-in, paid API designed to deliver Wikipedia at scale without overloading public servers and to financially support the nonprofit. The post also asks generative-AI providers to attribute Wikipedia content, link back to sources, and make it easy for users to visit and contribute. Wikimedia said it recently upgraded its bot detection after discovering AI bots in May–June that attempted to evade detection; at the same time, "human page views" fell about 8% year-over-year. The foundation stops short of legal threats but frames the move as essential to keeping volunteer contributions and donations flowing.

For the AI/ML community this is both a practical and a reputational shift: models that rely on Wikipedia should adopt licensed ingestion pipelines, honor attribution and provenance, and budget for access costs rather than depend on fragile scraping. Technically, providers will need to integrate Enterprise feeds or mirror agreements, update training and data-collection workflows to respect rate limits and terms, and implement clearer source tracking in outputs. The shift highlights growing industry expectations around data licensing, infrastructure impact, and traceability, all of which influence model training, compliance, and user trust.
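As a rough illustration of what "source tracking in outputs" could look like, the sketch below bundles each ingested passage with the metadata needed to attribute it and link back. Nothing here comes from the Wikimedia post or from the Enterprise API: `SourcedPassage`, `revision_id`, and `render_with_attribution` are hypothetical names, and the only outside facts assumed are the conventional article URL form and that Wikipedia text is licensed under CC BY-SA 4.0.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class SourcedPassage:
    """A text passage bundled with the provenance needed to attribute it."""
    text: str
    title: str            # article title as displayed on Wikipedia
    revision_id: int      # hypothetical field: revision the text was taken from
    license_name: str = "CC BY-SA 4.0"

    @property
    def url(self) -> str:
        # Conventional article URL: spaces become underscores, the rest is percent-encoded.
        return "https://en.wikipedia.org/wiki/" + quote(self.title.replace(" ", "_"))


def render_with_attribution(answer: str, sources: list[SourcedPassage]) -> str:
    """Append a de-duplicated list of source links to a generated answer."""
    unique: dict[str, SourcedPassage] = {}
    for source in sources:
        unique.setdefault(source.url, source)  # keep the first passage per article
    lines = [answer, "", "Sources:"]
    for source in unique.values():
        lines.append(f"- {source.title} (Wikipedia, {source.license_name}): {source.url}")
    return "\n".join(lines)
```

Carrying the provenance alongside the text from ingestion onward, rather than reconstructing it at output time, is one way to make attribution and link-back cheap to honor; a real pipeline would populate these fields from whatever licensed feed it uses.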
