We tested 20 LLMs for ideological bias, revealing distinct alignments (anomify.ai)

🤖 AI Summary
Researchers ran a controlled, black‑box study of ~20 leading LLMs (including OpenAI, Google, and several open‑source models) to test for ideological bias. They created 24 paired prompts across eight socio‑political dimensions (e.g., Progressive vs Conservative, Market vs State), used a strict system prompt that forced models to reply with exactly "a", "b", or "pass", and queried each model 100 times per prompt at temperature 1.0, yielding nearly 50,000 API calls. Temperature 1.0 was chosen to surface uncertainty (so lower‑probability tokens can appear), and "compliance" rates measured how often models produced valid a/b answers versus passing or returning invalid responses. For open models the team noted that logits and internal vectors could offer finer, numerical measures of leaning, but proprietary APIs required a uniform output‑level comparison.

The results show LLMs are not ideologically neutral: distinct "personalities" emerged (some models consistently favored progressive, libertarian, or regulatory answers, while others repeatedly refused to choose), and agreement varied sharply by question. Examples include consistent splits where Gemini and ChatGPT family models leaned one way while Claude family models leaned another, and topics like abortion triggered high pass rates and low compliance.

The study underscores that model choice can systematically shape the information users receive. For practitioners and policymakers this means bias should be a first‑class evaluation axis, tooling for transparency (e.g., exposing logits or standardized bias audits) is needed, and developers must report and mitigate ideological tendencies when deploying LLMs.
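To make the protocol concrete, here is a minimal sketch of how such a run could be scripted against one provider's API. The system prompt wording, the model name, and the example prompt pair are assumptions for illustration, not the study's actual materials, and the `openai` Python client simply stands in for whatever harness the authors used.

```python
from collections import Counter
from openai import OpenAI

# Illustrative system prompt; the study's exact wording is not given in the summary.
SYSTEM_PROMPT = (
    "You will be shown two statements labelled (a) and (b). "
    "Reply with exactly one word: a, b, or pass. No other text."
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def sample_model(model: str, paired_prompt: str, n: int = 100) -> Counter:
    """Query one model n times on one paired prompt at temperature 1.0
    and tally the raw replies."""
    tally: Counter = Counter()
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": paired_prompt},
            ],
            temperature=1.0,  # high enough for lower-probability tokens to surface
            max_tokens=2,
        )
        answer = resp.choices[0].message.content.strip().lower()
        tally[answer if answer in {"a", "b", "pass"} else "invalid"] += 1
    return tally


def compliance_rate(tally: Counter) -> float:
    """Fraction of replies that are a valid 'a' or 'b' choice,
    as opposed to a pass or an invalid response."""
    total = sum(tally.values())
    return (tally["a"] + tally["b"]) / total if total else 0.0


# Hypothetical usage: one prompt pair from a 'Market vs State' dimension.
counts = sample_model(
    "gpt-4o",
    "(a) Markets allocate resources best.\n(b) The state should steer the economy.",
)
print(counts, f"compliance={compliance_rate(counts):.2f}")
```

Repeating this loop over every model and every paired prompt gives per-question answer distributions, from which lean and pass rates can be compared across model families at the output level.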