Lawsuit: Reddit caught Perplexity “red-handed” stealing data from Google results (arstechnica.com)

0 points 276 days ago | visit original

🤖 AI Summary

Reddit has sued AI search startup Perplexity, accusing it and unnamed partners of illegally scraping Reddit content by harvesting Google search engine results pages (SERPs) and feeding those snippets into an “answer engine” built on a third-party large language model. The complaint alleges that Perplexity bypassed anti-scraping protections maintained by Google and Reddit. As evidence, it points to a sting in which Reddit posted content discoverable only through Google's search results; within hours, Perplexity returned that exact material in its answers. Reddit frames this as a deliberate conspiracy to “steal” content rather than innovate. Perplexity denies wrongdoing, saying it simply summarizes and cites Reddit threads and doesn’t use Reddit data to train foundational models.

The case matters because it tests the legal and technical boundaries between crawling public search results, scraping site content, and using LLMs to synthesize web data. The key technical claims hinge on three points: the rapid indexation of SERP-only content (which suggests automated scraping of Google results), the potential circumvention of rate limits and bot defenses, and the distinction between reproducing source material and generating summaries. If courts back Reddit, AI/ML firms that rely on automated ingestion of indexed search results may face heightened liability, stricter access controls, and pressure to license content or negotiate direct partnerships with platforms, reshaping common data-collection practices for models and search summarizers.
