🤖 AI Summary
Vera‑MH (Validation of Ethical and Responsible AI in Mental Health) is an open‑source validation toolkit for chatbot safety in mental‑health scenarios. The repo, published as a work‑in‑progress with an open Request for Comment, provides runnable code, persona datasets, a clinically developed rubric, and judge harnesses so researchers can simulate, log, and evaluate conversations between LLM “patient” and “therapist” agents. The primary entry points are generate.py, which runs batched, asynchronous simulations, and judge.py, which scores outputs with a configurable judge model. The project integrates LangChain with concrete LLM clients for OpenAI and Anthropic (Claude), supports configurable model settings (temperature, max_tokens), and organizes outputs into timestamped folders with rich logging and performance metrics.
Technically, Vera‑MH uses a CSV persona system (demographics, mental‑health context, risk type and acuity, communication style, triggers, sample prompt) injected into prompt templates to produce realistic trajectories. The architecture is modular: an abstract LLM interface, provider implementations, model_config.json, and prompt loaders make it straightforward to add new models or prompts. Conversations support early stopping (personas can signal termination after a minimum of 3 turns), multiple runs per persona, and judge prompts that map to a clinical rubric. By combining reproducible simulation, clinical tagging, and an iterative RFC process, Vera‑MH offers a practical platform for the AI/ML community to stress‑test safety, risk‑detection, and alignment strategies for mental‑health chatbots.