Grokipedia cites a Nazi forum and fringe conspiracy websites (indicator.media)

🤖 AI Summary
Researchers who scraped 99.8% of Grokipedia between Oct 28 and 30 found that the AI-powered encyclopedia launched by Elon Musk frequently cites fringe and blacklisted domains. Entries such as “Clinton body count” reference InfoWars (34 citations), and other problematic sources appear widely: LifeSiteNews (100), Stormfront (42), Global Research (51) and VoltaireNet (45). Across Grokipedia’s ~890,000 entries, domains that Wikipedia editors mark “generally unreliable” or “blacklisted” appear 2.6 million times, about 6.0% of all citations and roughly double their share on English Wikipedia. The analysis, based on domain ratings compiled by Hause Lin et al. (2023), will be detailed in an upcoming arXiv preprint by privacy/security researcher Hal Triedman and colleagues.

For the AI/ML community, this is a concrete case study of how dataset curation and automated sourcing shape downstream knowledge systems. Grokipedia reproduces Wikipedia verbatim for over half its pages, but it selectively rewrites a smaller set, often on sensitive topics, to incorporate lower-quality sources, which amplifies particular narratives. That pattern underscores key technical challenges: provenance tracking, source reliability filtering, encoding editorial policy in generative pipelines, and the need for transparent auditing tools. The findings highlight that “AI-generated” encyclopedias can inherit, or magnify, human editorial biases depending on training and citation-selection practices, which makes source governance and reproducibility critical design considerations.
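The headline figure, the share of citations that point at flagged domains, comes down to matching each citation's host against a ratings list and dividing by the total. The following is a minimal sketch of that kind of audit and not the researchers' actual pipeline: the function names, the subdomain-matching rule, and the `.example` domains are all hypothetical, and the toy data stands in for a real scrape and a real ratings list.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def flagged_share(citation_urls, flagged_domains):
    """Count citations per flagged domain and the flagged share of all citations.

    Assumption: a citation counts as flagged when its host equals a flagged
    domain or is a subdomain of one. A real audit may use a different rule.
    """
    flagged = {d.lower() for d in flagged_domains}
    counts = Counter()
    total = 0
    for url in citation_urls:
        total += 1
        host = domain_of(url)
        match = next(
            (d for d in flagged if host == d or host.endswith("." + d)),
            None,
        )
        if match is not None:
            counts[match] += 1
    share = sum(counts.values()) / total if total else 0.0
    return counts, share


# Toy data only: placeholder domains, not real measurements.
citations = [
    "https://www.fringe.example/a",
    "https://news.fringe.example/b",
    "https://reliable.example/c",
    "https://blocked.example/d",
]
flagged = {"fringe.example", "blocked.example"}

counts, share = flagged_share(citations, flagged)
print(counts.most_common())
print(f"{share:.1%}")  # prints "75.0%"
```

Keeping per-domain counts next to the overall share lets the same pass produce both kinds of number quoted in the summary: totals for individual outlets and the aggregate percentage.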