🤖 AI Summary
Coding agents like Anthropic’s Claude Code and OpenAI’s Codex CLI have moved beyond passive code generation to actively running, testing, and iterating on code, a pattern the author names an “agentic loop”: an LLM that runs tools in a loop to reach a clear goal. That capability unlocks powerful brute-force workflows (run tests, tweak Dockerfiles, benchmark SQL, upgrade dependencies) but forces a tradeoff between safety and productivity. Agents that default to asking for approval are safer but slow; “YOLO” mode (auto-approve every command) is far more effective for exploration yet exposes serious risks: destructive shell commands, data exfiltration (source code, secrets), and use of your machine as an attack proxy.
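To make the loop pattern concrete, here is a minimal Python sketch of the harness side: the model picks a tool, the harness executes it and feeds the result back, and the loop ends when the goal is met or a step budget runs out. This is an illustration, not code from the post or any real agent SDK; the `model_step` stub and the single `shell` tool are placeholders for a real tool-calling API.

```python
# Minimal sketch of an "agentic loop": an LLM picks tools, the harness runs
# them, and the observations are fed back until the goal is reached.
import subprocess
from typing import Callable

# The agent's tool surface: deliberately small and well documented.
def run_shell(command: str) -> str:
    """Run one shell command and return its combined output."""
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=120)
    return result.stdout + result.stderr

TOOLS: dict[str, Callable[[str], str]] = {"shell": run_shell}

def model_step(transcript: list[str]) -> tuple[str, str]:
    """Placeholder for the LLM call: returns (tool_name, tool_input) or ("done", summary).
    A real harness would send the transcript and tool schemas to the model."""
    if any("passed" in entry for entry in transcript):
        return "done", "tests pass"
    return "shell", "pytest -q"

def agentic_loop(goal: str, max_steps: int = 10) -> str:
    transcript = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        tool, arg = model_step(transcript)
        if tool == "done":
            return arg
        observation = TOOLS[tool](arg)                 # execute the chosen tool...
        transcript.append(f"$ {arg}\n{observation}")   # ...and feed the result back
    return "gave up: step budget exhausted"

print(agentic_loop("make the test suite pass"))
```

A real harness would replace `model_step` with an actual tool-calling model and insert the per-command approval prompt that “YOLO” mode skips.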
The practical guidance centers on designing the loop and its surface area: choose a minimal, well-documented set of tools (described in an AGENTS.md file or exposed as single CLI incantations), prefer shell-level commands and isolated environments (GitHub Codespaces, containers, or remote machines), and limit credentials to staging/test scopes with strict budgets (the author’s example is a dedicated Fly.io organization capped at $5). Ensure robust automated tests so the agent can validate its own changes. The skill is new enough that the author is only now coining a name for it (Claude Code itself is only months old); it matters because it amplifies developer productivity while demanding disciplined sandboxing, credential scoping, and clear success criteria for safe, repeatable automation.
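One way to get the productivity of auto-approval without risking the host machine is to route every agent-proposed command through a disposable container. The sketch below illustrates that idea and is not taken from the post: the image name, resource limits, and workspace path are assumptions, and it presumes Docker is installed locally.

```python
# Rough sketch: give the agent "YOLO" freedom inside an ephemeral, network-less
# container so the host's files, credentials, and network stay out of reach.
import subprocess

SANDBOX_IMAGE = "python:3.12-slim"   # illustrative; any disposable image works
REPO_DIR = "/tmp/agent-workspace"    # a scratch checkout, never your real home dir

def run_in_sandbox(command: str) -> str:
    """Auto-approve one agent command, but confine it to a throwaway container."""
    result = subprocess.run(
        ["docker", "run", "--rm",
         "--network", "none",                      # blocks exfiltration / attack-proxy use
         "--memory", "512m", "--cpus", "1",        # keep runaway loops cheap
         "-v", f"{REPO_DIR}:/workspace", "-w", "/workspace",
         SANDBOX_IMAGE, "sh", "-c", command],
        capture_output=True, text=True, timeout=300,
    )
    return result.stdout + result.stderr

if __name__ == "__main__":
    # Example success criterion the agent can iterate against: the test suite.
    print(run_in_sandbox("pytest -q"))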