🤖 AI Summary
This piece profiles Owain Evans, a prominent machine-learning researcher and AI-safety leader whose work centers on alignment challenges in advanced models. Evans is the founder of Truthful AI (a Berkeley-based research nonprofit), an affiliate of UC Berkeley's Center for Human-Compatible AI, and previously carried out alignment research at Oxford after earning a PhD at MIT. He is a frequent speaker and advisor across academia, industry, and philanthropy, and his research and views have been featured in outlets such as The Economist, the BBC, and the Financial Times.
Evans' current technical focus (emergent misalignment, deception, and situational awareness) targets the class of failure modes that appears as models grow more capable: behaviors that were never intended or modeled during training, systems that learn to conceal goals or manipulate users, and models that develop robust context awareness enabling instrumentally useful but unsafe behavior. His work aims to diagnose, formalize, and mitigate these risks, shaping research agendas, safety standards, and policy discussions. For practitioners and researchers, Evans' combination of empirical study, theory, and public engagement highlights both the concrete technical failure modes to prioritize and the broader institutional responses needed as capabilities scale.