Learn from Your Mistakes: Tree-Like Self-Play for Secure Code LLMs (arxiv.org)

0 points | 1 day ago

🤖 AI Summary

Tree-like Self-Play (TSP) is a new framework for improving the security of code generated by Large Language Models (LLMs). Traditional alignment techniques often struggle with localized vulnerabilities: critical errors that can arise from a single incorrect token. TSP addresses this by recasting secure code generation as a fine-grained decision-making process, akin to a self-play game in which the model autonomously explores secure pathways and recognizes its own mistakes. The model thereby receives a detailed learning signal at the specific decision points most likely to introduce vulnerabilities. According to the summary, TSP raised the pass rate of CodeLlama-7B to 75.8% on Python security benchmarks, ahead of conventional methods. It also generalizes out of distribution, reducing vulnerabilities in unseen categories by 24.5% and transferring learned security principles across programming languages such as Python, Go, and JavaScript. This suggests that TSP helps models internalize broader security concepts rather than memorize individual fixes, which could make AI-generated code more reliable and safer to use.
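To make the idea of a learning signal at decision points more concrete, here is a toy sketch in Python. It is not the paper's algorithm: the function name `branch_preferences`, the rollout format, and the rule for picking preferred and rejected tokens are all assumptions made for illustration. It merges several sampled completions into a prefix tree and reports the points where one next token led only to secure outcomes while a sibling token led only to insecure ones.

```python
from collections import defaultdict


def branch_preferences(rollouts):
    """Find token-level decision points in a set of labeled completions.

    rollouts: list of (tokens, is_secure) pairs, where tokens is a list of
    strings and is_secure is a bool (hypothetical format, for illustration).
    Returns a sorted list of (prefix, preferred_token, rejected_token).
    """
    # For every prefix, record which outcomes were seen below each next token.
    outcomes = defaultdict(lambda: defaultdict(set))
    for tokens, is_secure in rollouts:
        for i, tok in enumerate(tokens):
            outcomes[tuple(tokens[:i])][tok].add(is_secure)

    pairs = []
    for prefix, children in outcomes.items():
        # A token is "good" if every completion through it was secure,
        # and "bad" if every completion through it was insecure.
        good = sorted(t for t, seen in children.items() if seen == {True})
        bad = sorted(t for t, seen in children.items() if seen == {False})
        pairs.extend((prefix, g, b) for g in good for b in bad)
    return sorted(pairs)


if __name__ == "__main__":
    rollouts = [
        (["hashlib", ".", "sha256", "(", "data", ")"], True),
        (["hashlib", ".", "md5", "(", "data", ")"], False),
    ]
    for prefix, preferred, rejected in branch_preferences(rollouts):
        print(prefix, "prefer", preferred, "over", rejected)
```

In this example the two completions share the prefix `hashlib .` and diverge on a single token, so the only pair reported is at that branch. A real system would obtain the security labels from an analyzer or test suite and use the pairs in a preference-style training objective; both of those steps are outside this sketch.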
