🤖 AI Summary
The team behind Fraim argues that instead of trying to encode every security policy as brittle, exhaustive SAST rules (Semgrep/Checkov), you can use LLMs to evaluate intent directly. They built a "risk_flagger" workflow that runs an LLM (example: openai/gpt-5, temperature=1) over a git diff to flag custom risks described in plain language. The piece shows that hand-written deterministic rules, such as an LLM-generated Checkov rule listing "sensitive" admin ports and explicit 0.0.0.0/0 checks, miss real-world edge cases that humans spot easily: bespoke admin ports (TeamViewer's 5938), services mapped to non-default ports (Redis on 7000), and CIDR-splitting workarounds (0.0.0.0/1 + 128.0.0.0/1) that evade simple scanners.
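To make the gap concrete, here is a minimal Python sketch (illustrative only, not Fraim's or Checkov's actual rule code) of the kind of deterministic port-and-CIDR check described above; the port list and helper name are hypothetical, and the point is simply that the article's three edge cases slip past exact matching.

```python
# Illustrative sketch of a deterministic "sensitive port open to the internet"
# rule, of the kind the article says was generated for Checkov. Exact-match
# logic like this encodes specific ports and specific CIDR strings, not intent.

SENSITIVE_PORTS = {22, 3389, 5432, 6379}   # SSH, RDP, Postgres, Redis defaults (hypothetical list)
OPEN_TO_WORLD = {"0.0.0.0/0", "::/0"}      # only the literal "anywhere" CIDRs

def flag_ingress_rule(port: int, cidr_blocks: list[str]) -> bool:
    """Return True if the ingress rule trips the deterministic check."""
    return port in SENSITIVE_PORTS and any(c in OPEN_TO_WORLD for c in cidr_blocks)

# Edge cases from the article that a human (or intent-aware LLM) flags,
# but this exact-match rule waves through:
print(flag_ingress_rule(5938, ["0.0.0.0/0"]))               # TeamViewer port not in the list -> False
print(flag_ingress_rule(7000, ["0.0.0.0/0"]))               # Redis on a non-default port -> False
print(flag_ingress_rule(22, ["0.0.0.0/1", "128.0.0.0/1"]))  # CIDR split still covers the internet -> False
```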
Technically, Fraim’s LLM-based evaluation can inspect diffs, search the repo to map ports to services, and reason about intent rather than exact patterns, producing human-readable findings (severity, location, explanation). The authors note this requires prompt engineering (coaxing the model to call tools) and comes with trade-offs around nondeterminism and reproducibility, but they argue LLMs excel at subjective, under-specified policies that are impractical to enumerate exhaustively. For practitioners, the implication is to treat LLM analysis as a complementary layer to deterministic SAST, better at intent and edge cases, but one that still needs CI integration, prompt tuning, and guardrails for consistent results.
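For flavor, a minimal sketch of the intent-level evaluation described above, assuming the OpenAI Python SDK, a plain-language policy string, and a unified diff saved to a hypothetical change.diff; Fraim's actual risk_flagger workflow additionally gives the model tools to search the repository and emits structured findings rather than free text.

```python
# Minimal sketch: ask an LLM to evaluate a git diff against a plain-language
# risk policy. Not Fraim's implementation; just the general shape of the idea.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

RISK_POLICY = """Flag any change that exposes an administrative or data-store
service to the public internet, even if the port is non-default or the CIDR
ranges only collectively cover 0.0.0.0/0."""

with open("change.diff") as f:   # hypothetical file holding the git diff
    diff = f.read()

response = client.chat.completions.create(
    model="gpt-5",      # model named in the article's example
    temperature=1,      # as in the article's example configuration
    messages=[
        {"role": "system",
         "content": "You are a security reviewer. For each finding, report severity, location, and an explanation."},
        {"role": "user",
         "content": f"Policy:\n{RISK_POLICY}\n\nDiff:\n{diff}"},
    ],
)
print(response.choices[0].message.content)
```

Because the output is nondeterministic, the prompt tuning, CI integration, and guardrails the authors mention are what make this kind of check usable as a repeatable pipeline step.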