Why Do LLMs Design Mediocre Architecture? (www.recurse.ml)

🤖 AI Summary
An experiment using Claude Code to add email notifications to a FastAPI example app shows why LLMs often produce “mediocre architecture”: the model copied an existing pattern into six nearly identical functions (generate_new_account_email, generate_profile_update_email, generate_admin_profile_update_email, etc.) instead of introducing a reusable abstraction. The generated code also contained two real defects—catching bare Exception and logging it at info level (which hides bugs) and referencing HTML templates that don’t exist—while ignoring the prompt’s broader requirements (multi-channel notifications, robust retry/failure handling).

This conservative behaviour stems from models’ tendency to mirror existing code (“sycophancy”), optimizing for local consistency rather than global design quality. That failure mode matters because it compounds technical debt as teams scale LLM-assisted development.

The practical implication: reviewers should stop focusing on line-level fixes and shift to high-level design decisions—whiteboard before you code, insist on abstractions for repeatable patterns, and treat LLM output as implementation, not architecture. Operationally, add automated checks for recurring mistakes (e.g., forbid bare Exception handlers), rely on tests/linters/Recurse.ml for low-level correctness, and convert common PR comments into custom rules. When implementation is cheap, refactoring becomes attainable—so use LLMs to offload rote work and reserve human effort for the hard, subjective design choices that determine long-term maintainability.
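A minimal sketch of the failure mode and the fix the summary calls for. The function bodies and template strings here are hypothetical (the summary only gives the function names); the point is the shape: N near-identical functions versus one renderer driven by a template table.

```python
# Hypothetical sketch of the pattern the model copied: one near-identical
# function per notification type (function names are from the summary;
# bodies are invented for illustration).
def generate_new_account_email(username: str) -> dict:
    return {"subject": f"New account for {username}",
            "body": f"Hello {username}, your account was created."}

def generate_profile_update_email(username: str) -> dict:
    return {"subject": f"Profile updated for {username}",
            "body": f"Hello {username}, your profile was updated."}

# ...four more copies in the generated code.

# The reusable abstraction a reviewer would ask for: one renderer plus
# a data table, so adding a notification type is a one-line change.
TEMPLATES = {
    "new_account": ("New account for {username}",
                    "Hello {username}, your account was created."),
    "profile_update": ("Profile updated for {username}",
                       "Hello {username}, your profile was updated."),
}

def generate_email(event: str, **ctx) -> dict:
    subject, body = TEMPLATES[event]
    return {"subject": subject.format(**ctx), "body": body.format(**ctx)}
```

The abstraction is trivial to write, which is exactly the article's point: the model had no trouble with implementation, it simply mirrored the existing per-function pattern instead of proposing the table-driven design.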
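The "convert common PR comments into custom rules" advice can be automated cheaply. As one possible approach (not from the article), a few lines of Python's `ast` module can flag the bare-`Exception` handlers the summary calls out; linters such as Ruff ship equivalent rules.

```python
import ast

def find_broad_handlers(source: str) -> list:
    """Return line numbers of bare `except:` or `except Exception:` handlers."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            t = node.type
            # t is None for a bare `except:`; a Name node for `except Exception:`
            if t is None or (isinstance(t, ast.Name) and t.id == "Exception"):
                hits.append(node.lineno)
    return hits
```

Wired into CI, a check like this turns a recurring review comment into an automatic failure, freeing reviewers for the design-level questions the article argues humans should own.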