AI-written web pages haven't overwhelmed human-authored content, yet (www.axios.com)

🤖 AI Summary
Graphite analyzed a random sample of 65,000 Common Crawl URLs dated 2020 through May 2025 and found that AI-written web pages rose sharply after ChatGPT's debut in late 2022. Graphite used the Surfer detector, classifying a page as AI-generated when 50% or less of its content was identified as human-written. By that measure, AI-generated articles briefly outnumbered human-authored ones in November 2024, and the two have stayed roughly even since. Graphite validated Surfer on its own GPT-4o-generated content and on pages published before ChatGPT, reporting a 4.2% false positive rate and a 0.6% false negative rate. A parallel analysis shows that search engines and chatbots still favor human content: 86% of Google-ranked articles and 82% of sources cited by ChatGPT and Perplexity were human-authored, with AI pages generally ranking lower. The study matters because Common Crawl is a major source of LLM training data, so shifts in web composition could influence model training and content-farm strategies. The results are constrained, however, by the limits of AI detection, by mixed human–AI workflows that blur authorship, and by dataset bias (paywalled sites, which are likely human-written, often block Common Crawl). Taken together, the findings suggest that AI's footprint on the public web is growing but not dominant, and that search and chat systems currently deprioritize purely auto-generated material. User sentiment points the same way: only 20% find AI search summaries very useful and 6% trust them a lot, implying human-authored content still carries authority and SEO value.
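To make the methodology concrete, here is a minimal Python sketch of the classification rule and the two error rates described above. It assumes a detector returns, for each page, the fraction of content it scores as human-written; the function names and the sample data are hypothetical and are not taken from Surfer or from Graphite's study.

```python
def classify_page(human_fraction: float, threshold: float = 0.5) -> str:
    """Label a page 'ai' when at most `threshold` of its content is scored human-written."""
    if not 0.0 <= human_fraction <= 1.0:
        raise ValueError("human_fraction must be between 0 and 1")
    return "ai" if human_fraction <= threshold else "human"


def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate), with 'ai' as the positive class.

    False positive: a human-written page flagged as AI.
    False negative: an AI-written page passed as human.
    """
    pairs = list(zip(labels, predictions))
    false_pos = sum(1 for y, p in pairs if y == "human" and p == "ai")
    false_neg = sum(1 for y, p in pairs if y == "ai" and p == "human")
    humans = sum(1 for y, _ in pairs if y == "human")
    ais = sum(1 for y, _ in pairs if y == "ai")
    fpr = false_pos / humans if humans else 0.0
    fnr = false_neg / ais if ais else 0.0
    return fpr, fnr


if __name__ == "__main__":
    # Made-up detector scores for illustration only (not data from the study).
    known_labels = ["human", "human", "ai", "ai"]
    human_fractions = [0.95, 0.40, 0.10, 0.70]
    predicted = [classify_page(f) for f in human_fractions]
    fpr, fnr = error_rates(known_labels, predicted)
    print(f"false positive rate: {fpr:.1%}, false negative rate: {fnr:.1%}")
```

A validation like the one summarized above would run this kind of comparison on pages whose authorship is already known: pre-ChatGPT pages as the human set and deliberately generated articles as the AI set.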