Finding Security Bugs in OSS with LLMs on a Budget (www.etive-mor.com)

🤖 AI Summary
In a talk at [un]prompted 2026, AI researcher Nicholas Carlini described using Anthropic's Claude models to automate the detection of security vulnerabilities in code repositories. The approach found real vulnerabilities, including a heap buffer overflow in the Linux kernel, but it was resource-intensive: a comprehensive scan could cost around $40,000 because of the large number of files and model invocations required. To address this, he proposed using heuristics to identify and prioritize files, reducing costs while keeping the focus on high-impact areas. Building on Carlini's work, an alternative method was trialed using GitHub Copilot, which flagged 20 potential vulnerabilities in a smaller open-source project, Umbraco-CMS, for less than $20. The method first generates detailed onboarding documentation with security considerations, then scans only the relevant files rather than the entire codebase. Although its findings were generally high-level, the technique suggests that large-scale bug searches can be made feasible for budget-conscious developers and organizations.
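To make the idea of heuristic file prioritization concrete, here is a minimal Python sketch of ranking files and cutting the list off at a budget. It is not the method from the talk or the article: the keyword weights, the extension list, the flat `COST_PER_FILE` figure, and the function names `score_path` and `select_files` are all hypothetical, chosen only for illustration.

```python
# Hypothetical path keywords and weights; illustrative only, not taken from the talk.
RISKY_KEYWORDS = {"auth": 3, "parse": 3, "upload": 2, "crypto": 2, "serial": 2}
# Assumed set of source-file extensions worth scanning.
SOURCE_EXTENSIONS = (".c", ".h", ".cs", ".py", ".js")
# Assumed flat cost per scanned file in dollars; real cost depends on model and tokens.
COST_PER_FILE = 0.50


def score_path(path: str) -> int:
    """Return a rough priority score for a file path (higher means scan first)."""
    lowered = path.lower()
    if not lowered.endswith(SOURCE_EXTENSIONS):
        return 0  # non-source files are never scanned
    # Every source file gets a base score of 1, plus the weight of each keyword present.
    return 1 + sum(w for kw, w in RISKY_KEYWORDS.items() if kw in lowered)


def select_files(paths: list[str], budget: float) -> list[str]:
    """Pick the highest-scoring source files whose total assumed cost fits the budget."""
    candidates = [p for p in paths if score_path(p) > 0]
    ranked = sorted(candidates, key=score_path, reverse=True)
    max_files = int(budget // COST_PER_FILE)
    return ranked[:max_files]


if __name__ == "__main__":
    repo = [
        "src/auth/login.cs",
        "docs/readme.md",
        "src/util/strings.cs",
        "src/parser/json_parse.c",
    ]
    for path in select_files(repo, budget=1.00):
        print(path)
```

With a budget of $1.00 and the assumed $0.50 per file, only the two highest-scoring paths are selected. In practice the scoring signal could come from anything cheap to compute, such as the security notes in generated onboarding documentation, rather than path keywords.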