What happened after 2k people tried to hack my AI assistant (www.fernandoi.cl)

0 points 3 hours ago | visit original

🤖 AI Summary

A developer ran an experiment to test the security of their AI assistant, Fiu. It attracted over 2,000 participants, who collectively sent more than 6,000 emails trying to extract sensitive information from a secrets.env file. Despite a barrage of creative and sometimes sophisticated prompt injections, including impersonation and social engineering, Fiu did not leak any information. The exercise underscores why securing AI systems matters as they increasingly manage sensitive data such as emails and calendars: an attacker who successfully manipulates such a system puts that data at risk. The experiment also suggested that powerful models like Opus 4.6 are resilient against such attacks, since simple, clear anti-prompt-injection instructions proved effective. The project's visibility also led to sponsorship opportunities, a sign of community interest in AI security. After watching more than 6,000 attempts fail, the developer reported a newfound optimism about prompt injection vulnerabilities. They caution, however, that despite the promising results, AI agents should still be handled with care and not granted arbitrary permissions, particularly as research hints at potential weaknesses in non-English contexts.
