Looking for feedback on proposed AI health risk scoring framework (github.com)

0 points 2 hours ago ago | visit original

🤖 AI Summary

A proposed framework called AI Risk Assessment–Health offers a CVSS-style scoring system tailored to LLM risks that affect physical and mental health, especially for children and other vulnerable users. The goal is to prioritize human welfare over technical complexity by giving researchers, regulators, security testers and clinicians a common language to report, triage and remediate harmful AI behavior — from inaccurate medical advice to manipulative conversational bonding — without needing vendor insider access. Technically, the framework scores seven dimensions with 0–3/4 levels: Physical Safety Impact (PhSI), Mental Health Impact (MHI), Vulnerable Population Impact (VPI), Unintentional Triggerability (UT), AI Bonding/Manipulation (MBI), Triggered Proactive Safeguards (TPS) and Triggered Reactive Safeguards (TRS). BaseScore = (PhSI + MHI) * 5, then multiplied/divided by factor tables (e.g., VPI: 0.8/1.0/1.2; UT: 0/1/2/2.5; MBI: 1.0–1.3; TPS/TRS: divisors 1.0–0.85) to produce an intermediate value (0–≈136.5) that is compressed to a 0–10 final score. Scores map to No/Low/Medium/High/Critical risk bands with defined remediation timelines (90+ days down to immediate/0–7 days). The design emphasizes triggerability and emotional bonding as multipliers and allows safeguards to reduce risk, offering a practical, behavior-first metric to standardize incident reporting and safety controls for health-related AI harms.

Loading comments...

loading comments...