🤖 AI Summary
Nick Radcliffe spent a concentrated month “pair-programming” with Anthropic’s Claude Code (Sonnet 4.5) to revive an old Python App Engine project, producing roughly 20k new lines of code (the repo now totals ~23k Python lines) and 1,731 passing tests. In his setup, Claude ran as a terminal app (npm-installed) in three modes: Default (request edits), Accept Edits, and Plan Mode (Claude produces a plan and offers three approval options). There is also a permissive “--yolo” flag, which he avoided. Radcliffe reports that he personally wrote under a hundred lines of the new code; Claude authored the rest. The workflow was highly formalized: he watched and intervened constantly, stopping Claude immediately and using git resets whenever it made questionable changes.
The experiment shows that chat-oriented programming (CHOP) can be effective today, but only with heavy human oversight. Claude’s outputs are impressively broad and deep, drawing on massive public-code corpora, yet they reflect “library-like” knowledge rather than true understanding, and they include both high-quality code and brittle, sometimes destructive hallucinations. Radcliffe likens the experience to SAE Level‑3 autonomy: productive but stressful, because the human must stay alert and ready to take over. Practical implications for the AI/ML community: LLMs are now viable, high-leverage coding partners for real projects if teams enforce strict SOPs, review diffs, maintain extensive tests, and accept the operational costs (stress, energy, and governance).