Testing Sonnet/Opus vs. GPT-5 vs. Code Supernova on real coding tasks (blog.kilocode.ai)

🤖 AI Summary
Kilo Code’s head-to-head tests found that Code Supernova produces working code 6–10× faster than GPT-5 while delivering UI polish close to Sonnet 4’s. In timed challenges (a React landing page for a Postgres hosting product in 17s, and a TypeScript SQLite job queue in 20s), Supernova turned out functional, visually solid results quickly, but with trade-offs: huge single-file components, copy-pasted sections, and backend implementations missing transaction rollbacks, job unlocking, cleanup, and robust error propagation. By contrast, GPT-5 took minutes but generated more modular, production-ready architectures (transactions, visibility timeouts, ack/fail/release semantics).

Technically, Supernova behaves like an execution model: it follows instructions verbatim rather than reasoning through architecture or edge cases. It is fastest in iteration-heavy scenarios (UI component generation, API clients, POCs, quick feature additions, static pages) but scores lowest on robustness and production readiness. Testing also flagged a September 2024 knowledge cutoff (behind Sonnet/Opus), so it may output outdated Next.js/React/TypeScript/Tailwind patterns. The benchmark matrices ranked GPT-5 > Opus 4.1 > Sonnet 4 > Supernova on architecture and robustness, and the reverse on raw speed.

Practical takeaway: use Supernova to rapidly prototype and explore visual/layout options, then refactor the selected outputs with a planning-oriented model like GPT-5 for production hardening. Supernova is free in Kilo Code now (200k context window, no rate limits), making it useful for fast iteration, but it is not a drop-in replacement for production, team, or safety-critical code.
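For context on the job-queue criteria above, here is a minimal sketch of what ack/fail/release semantics with a visibility timeout can look like in TypeScript on SQLite. This is an illustration, not the article’s actual test output; the better-sqlite3 driver, the `jobs` schema, and the `claim`/`ack`/`fail`/`release` names are all assumptions made for the example.

```typescript
// Hypothetical job-queue sketch (not from the article), assuming better-sqlite3.
import Database from "better-sqlite3";

const db = new Database("queue.db");
db.exec(`
  CREATE TABLE IF NOT EXISTS jobs (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    payload      TEXT    NOT NULL,
    status       TEXT    NOT NULL DEFAULT 'pending', -- pending | active | done | failed
    attempts     INTEGER NOT NULL DEFAULT 0,
    locked_until INTEGER                             -- unix ms; NULL when unclaimed
  )`);

const VISIBILITY_MS = 30_000; // how long a claimed job stays invisible to other workers

// Atomically claim the oldest visible job. Active jobs whose lock has expired
// become visible again, which is the "job unlocking" the tests checked for.
const claim = db.transaction((): { id: number; payload: string } | undefined => {
  const job = db
    .prepare(
      `SELECT id, payload FROM jobs
       WHERE status IN ('pending', 'active')
         AND (locked_until IS NULL OR locked_until < ?)
       ORDER BY id LIMIT 1`
    )
    .get(Date.now()) as { id: number; payload: string } | undefined;
  if (!job) return undefined;
  db.prepare(
    `UPDATE jobs SET status = 'active', attempts = attempts + 1, locked_until = ?
     WHERE id = ?`
  ).run(Date.now() + VISIBILITY_MS, job.id);
  return job;
});

// ack: job succeeded; fail: give up permanently; release: hand it back
// to the queue for another worker (e.g. on graceful shutdown).
const ack = (id: number) =>
  db.prepare(`UPDATE jobs SET status = 'done', locked_until = NULL WHERE id = ?`).run(id);
const fail = (id: number) =>
  db.prepare(`UPDATE jobs SET status = 'failed', locked_until = NULL WHERE id = ?`).run(id);
const release = (id: number) =>
  db.prepare(`UPDATE jobs SET status = 'pending', locked_until = NULL WHERE id = ?`).run(id);
```

A worker loop would then repeat claim() → process → ack() (or fail() after too many attempts), calling release() on graceful shutdown. The expired-lock check inside claim() is what keeps a crashed worker from stranding jobs forever, the kind of robustness the tests found missing from Supernova’s version.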