Learnings from 1 year of agents: PostHog AI (posthog.com)

🤖 AI Summary
PostHog launched PostHog AI, an agent built into its analytics platform that can access your product data and execute multi-step analyst tasks—writing SQL, creating charts, setting up feature flags and experiments, and triaging errors—by looping through dozens of tools until the task is done. After a year of iteration and beta use by thousands weekly, the stack settled on Claude Sonnet 4.5 as the core agent loop (with Anthropic's Claude 4 family enabling reliable tool use and OpenAI’s o4-mini providing cost‑effective reasoning for complex query creation), plus a GPT-5-mini powered /init flow to build project-level context from the web. The agent streams every tool call and reasoning token to users for transparency and uses a “todo_write” step as a lightweight but powerful way to keep the LLM on task. Their engineering lessons matter for the wider AI community: modern reasoning models change system behavior dramatically and can simplify architecture; a single LLM loop with persistent history outperforms graph-based orchestrators or layered subagents for free-form work (subagents only help when tasks are truly parallel and self-contained); wide, effortless context is essential to reduce ambiguity; evals are useful but insufficient—real production traces reveal surprising edge cases; and dependency on heavyweight frameworks can become a liability as models and provider APIs evolve. Practical implication: simpler, low‑level orchestration with robust context, streaming visibility, and production trace analysis yields far more reliable agent behavior than elaborate orchestration frameworks.
Loading comments...
loading comments...