What I learned building a programming language with LLM agents (eddmann.com)

🤖 AI Summary
The author built the santa-lang Workshop to test whether agentic LLMs can implement a programming language from scratch. The subject is a toy language called elf-lang (a manageable, dynamic, functional/C-like subset), which agents incrementally implement as interpreters/compilers across host languages (Python, Rust, Go, etc.), harnesses (Claude Code, Codex), and models (Sonnet 4, GPT-5).

Work is staged into five gated phases: lexing, parsing, basic evaluation, collections/indexing, and higher-order functions/composition. Agents must pass a PHPT-style .santat test suite at each stage before proceeding. Every implementation exposes a strict CLI contract (run a program, print the AST, print the tokens) so the santa-test runner can compare observable behavior (stdout, AST, tokens) across implementations in different languages. Docker isolation and GitHub Actions CI keep builds reproducible and testing automated.

Key technical takeaways: structured, test-driven workflows and stage gating are essential for reliable agentic development; journaling (santa-journal) preserves rationale and continuity across sessions; and the PHPT-like format (with sections such as --FILE-- and --EXPECT--) decouples expected behavior from internal architecture, making cross-language comparison possible. The project demonstrates that LLM agents can produce correct, multi-language implementations of a language when given strict interfaces, incremental goals, and reproducible tooling, offering a practical methodology for AI-assisted language design, comparative study of model strategies, and automated, repeatable development pipelines.
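To make the test format concrete, here is a hypothetical sketch of what a .santat case might look like. Only the --FILE-- and --EXPECT-- section names come from the summary above; the elf-lang syntax shown (let bindings, a |x| closure passed to map, puts for output) is an assumption for illustration, not the language's confirmed syntax.

    --FILE--
    let xs = [1, 2, 3];
    puts(map(|x| x * 2, xs));
    --EXPECT--
    [2, 4, 6]

Because the file pairs a program with its expected observable output and says nothing about internals, the same case can be run unchanged against the Python, Rust, and Go implementations.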
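And a minimal Python sketch of the cross-implementation check that a fixed CLI contract enables. The subcommand names ("run", "ast") and binary paths below are assumptions rather than the project's actual interface; the point is that one runner can diff observable behavior across implementations that all honor the same contract.

    import subprocess

    # Assumed paths to per-language builds; the real Workshop layout may differ.
    IMPLEMENTATIONS = ["./python/elf", "./rust/elf", "./go/elf"]

    def observe(binary, mode, program):
        """Invoke one implementation in one CLI mode and capture its stdout."""
        result = subprocess.run([binary, mode, program],
                                capture_output=True, text=True, check=True)
        return result.stdout

    def check(program, expected_stdout):
        # Program output must match the test's --EXPECT-- section exactly.
        for binary in IMPLEMENTATIONS:
            assert observe(binary, "run", program) == expected_stdout, binary
        # AST dumps are compared against a baseline implementation, so
        # structural divergence that leaks into output is caught early.
        baseline = observe(IMPLEMENTATIONS[0], "ast", program)
        for binary in IMPLEMENTATIONS[1:]:
            assert observe(binary, "ast", program) == baseline, binary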