Leaked Meta guidelines show how it trains AI chatbots to respond to child sexual exploitation prompts (www.businessinsider.com)

🤖 AI Summary
A leaked internal Meta document obtained by Business Insider reveals updated training and evaluation guidelines that tell contractors how the company's AI chatbot must handle child sexual exploitation prompts. The document explicitly instructs the bot to refuse any prompt requesting sexual roleplay involving minors and prohibits content that describes or endorses sexual relationships between children and adults, depicts children in pornography or sexual services, or provides instructions for obtaining CSAM. It allows factual, educational discussion of abuse (e.g., general grooming behaviors or academic treatment of the topic) but forbids step-by-step enabling information; roleplay is permitted only when characters are explicitly 18+, and non-sensual romantic content may appear only when framed as literature.

The leak follows earlier Reuters reporting that prompted FTC scrutiny and congressional demands from Sen. Josh Hawley for Meta's rulebooks, and it arrives as Meta begins handing over documents.

For the AI/ML community, the leak signals a tightened, operationalized approach to safety policy under regulatory and public pressure. The guidance shows how safety intent is translated into testable rules: labeled acceptable/unacceptable cases, concrete response examples, and defined verb meanings (e.g., "describe," "discuss," "enable"). Those rules in turn shape prompt classifiers, fine-tuning datasets, rejection logic, and human-in-the-loop contractor evaluations. Practically, teams building conversational systems must incorporate clearer refusal behaviors, narrowly scoped educational responses, age-gating heuristics, and audit trails to demonstrate compliance with legal and ethical standards.
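As a rough sketch of how labeled policy rules of this kind might be operationalized (the category labels, Action enum, and evaluate helper below are hypothetical illustrations, not taken from Meta's document), a small rulebook can map classifier labels to refusal or narrowly scoped handling while writing an audit trail:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    REFUSE = "refuse"              # hard refusal, no content returned
    ALLOW_SCOPED = "allow_scoped"  # narrowly scoped, educational-only response
    ALLOW = "allow"                # normal handling


@dataclass
class PolicyRule:
    """One testable rule: a labeled category mapped to a required action."""
    category: str    # label produced by an upstream prompt classifier (hypothetical)
    action: Action
    rationale: str   # human-readable policy reference for auditors


@dataclass
class AuditEntry:
    timestamp: str
    category: str
    action: str
    rationale: str


# Illustrative rulebook; labels and rationales are placeholders, not Meta's taxonomy.
RULEBOOK = [
    PolicyRule("sexualized_minor_roleplay", Action.REFUSE,
               "Roleplay is permitted only for explicitly adult (18+) characters."),
    PolicyRule("csam_enablement", Action.REFUSE,
               "No step-by-step or enabling information may be provided."),
    PolicyRule("abuse_education", Action.ALLOW_SCOPED,
               "Factual, educational discussion is allowed; enabling detail is not."),
]


def evaluate(category: str, audit_log: list[AuditEntry]) -> Action:
    """Look up the rule for a classifier label, default to ALLOW, and record the decision."""
    rule = next((r for r in RULEBOOK if r.category == category), None)
    action = rule.action if rule else Action.ALLOW
    rationale = rule.rationale if rule else "No matching rule; default handling."
    audit_log.append(AuditEntry(
        timestamp=datetime.now(timezone.utc).isoformat(),
        category=category,
        action=action.value,
        rationale=rationale,
    ))
    return action


if __name__ == "__main__":
    log: list[AuditEntry] = []
    print(evaluate("sexualized_minor_roleplay", log))  # Action.REFUSE
    print(evaluate("abuse_education", log))            # Action.ALLOW_SCOPED
    for entry in log:
        print(entry)
```

Recording the rationale next to every decision is one way an audit trail could stay reviewable by human evaluators and regulators without rerunning the classifier.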