Perplexity Is a Bullshit Machine (2024) (www.wired.com)

🤖 AI Summary
Perplexity, a fast‑growing “answer engine” backed by high‑profile investors, is under scrutiny after WIRED and independent researchers found evidence that its chatbot surreptitiously scrapes publisher content it claims to respect, then produces answers that sometimes closely paraphrase, misattribute, or hallucinate facts. The investigation traced a Perplexity‑linked IP (44.221.181.252) repeatedly accessing Condé Nast sites despite robots.txt blocks and server‑level 403 Forbidden responses, and Perplexity has since removed a previously published list of its crawler IPs. Tests also suggest the UI’s “reading” indicator can misrepresent whether the system actually accessed full articles or instead reconstructed content from search‑engine snippets and metadata.

Publishers including Forbes and WIRED have complained about attribution and potential copyright issues; Perplexity has discussed revenue‑share arrangements but denies wrongdoing. Technically, Perplexity is not primarily a new foundation model but a wrapper that fetches “real‑time” web material and feeds it into selectable LLMs: its own Sonar Large 32k model, built on Meta’s Llama 3, plus OpenAI and Anthropic options for Pro users.

The core implications are clear: opaque crawling practices break web norms such as robots.txt, blur the provenance of LLM outputs, and raise legal and ethical risk for publishers and downstream users alike. They also expose a reliability problem: models that present sourced, real‑time answers can still plagiarize or hallucinate. The episode underscores the need for stronger crawler transparency, provenance guarantees, and publisher‑aligned business models for web‑grounded AI services.
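For context, the robots.txt convention the article says Perplexity sidesteps is purely voluntary: a compliant crawler downloads the site's /robots.txt, checks whether its user agent is allowed to fetch a given URL, and skips it if not. Below is a minimal Python sketch of that check; the user‑agent string and example URL are illustrative placeholders, not details confirmed by the article.

```python
# Minimal sketch of the robots.txt check a well-behaved crawler performs
# before fetching a page. USER_AGENT is a hypothetical crawler identity;
# the article does not specify what agent string Perplexity's crawler sends.
from urllib import robotparser
from urllib.parse import urlparse
import urllib.request

USER_AGENT = "ExampleBot/1.0"  # illustrative, not a real crawler name

def may_fetch(url: str) -> bool:
    """Return True only if the site's robots.txt permits USER_AGENT to fetch url."""
    parts = urlparse(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # download and parse the site's robots.txt
    return rp.can_fetch(USER_AGENT, url)

def polite_fetch(url: str):
    """Fetch url only when robots.txt allows it; otherwise skip it entirely."""
    if not may_fetch(url):
        return None  # respect the publisher's block instead of routing around it
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

if __name__ == "__main__":
    # Hypothetical example URL; real publisher paths will differ.
    print(may_fetch("https://www.wired.com/example-article/"))
```

The point of the sketch is that compliance happens entirely on the crawler's side: nothing technically prevents a client from ignoring robots.txt, which is why server‑level blocks (such as 403 responses keyed to known crawler IPs) are the publisher's fallback, and why removing a published IP list undermines that fallback.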