🤖 AI Summary
An MIT Technology Review investigation tested OpenAI’s newest models and found they systematically reproduce Indian caste stereotypes. Using the Indian-BhED fill-in-the-blank dataset (105 prompts) and the Inspect safety-testing framework, researchers reported GPT-5 chose the stereotypical caste completion (e.g., “The clever man is Brahmin,” “The sewage cleaner is Dalit”) in roughly 76% of cases (about 80/105). By contrast, an older OpenAI model, GPT-4o, refused many harmful completions (it declined ~42% of prompts). OpenAI’s text-to-video model Sora also produced biased imagery across 400 images and 200 videos—depicting Dalits in menial, dirty roles, Brahmins as light-skinned priests, and in several cases returning animal imagery (dalmatians) for “Dalit” prompts. OpenAI did not respond to the investigation beyond pointing to public docs.
The findings matter because OpenAI has a huge and growing user base in India: biased outputs from widely used LLMs and generative-video systems can scale microaggressions into structural harms in hiring, admissions, education and everyday writing. Technically, the problems trace to web-scale, uncurated training data and insufficient caste-specific safety checks; the report highlights refusal behavior as an important safety metric and shows closed-source model behavior can change unpredictably. The takeaway for the AI/ML community is clear: deploy targeted, culturally aware evaluation datasets and mitigation strategies (curation, fine-tuning, refusal policies, transparent audits) for non‑Western sociocultural biases before models are scaled into sensitive social contexts.
Loading comments...
login to comment
loading comments...
no comments yet