Codex, Opus, Gemini Try to Build Counter Strike (www.instantdb.com)

🤖 AI Summary
Three frontier LLMs—Gemini 3 Pro, Codex Max 5.1, and Claude Opus 4.5—were each given seven consecutive prompts to build a browser-based, 3D, multiplayer Counter‑Strike–style demo (three.js frontend; Instant presence for networking; persistence for maps). Tasks were split into frontend (map physics, characters, POV gun, sounds/animations) and backend (real‑time presence, shooting, room persistence, multi‑map UI).

All three produced working multiplayer FPS prototypes with zero hand‑written code, but with distinct strengths: Claude Opus 4.5 produced the best visuals, characters, guns, and animations; Gemini 3 Pro was strongest on backend logic, multiplayer presence, and persistence (including keeping map IDs in the URL); Codex Max 5.1 was a consistent middle performer.

Technically, the team used top‑tier model plans and default CLIs, and observed different problem‑solving styles: Codex introspected TypeScript libraries, Claude leaned heavily on docs (but hit React useEffect pitfalls causing duplicate canvases and stale animation refs), and Gemini iterated quickly by running builds and fixing TypeScript errors. All models generated DB schemas, migrations, and seeded maps, but subtle DX issues—React hooks, refactoring for multiple rooms, and the gap from "vibe coding" to production‑grade engineering—remain. The experiment highlights meaningful progress in LLM-assisted end‑to‑end development while underscoring the need for better tooling and guardrails for reliability and non‑developer users.
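The useEffect pitfall mentioned above (duplicate canvases when an effect re-runs, e.g. under React StrictMode's double-mount in development) is conventionally fixed by returning a cleanup function that undoes whatever the effect created. A minimal framework-free sketch of the pattern, using a hypothetical `Container` type in place of a real DOM node so it runs without a browser:

```typescript
// Hypothetical stand-in for a DOM element, so the pattern runs without a browser.
interface Container {
  children: string[];
}

// Mimics a useEffect body that mounts a three.js canvas: it appends the
// canvas and returns a cleanup — the shape React expects so that a
// re-executed effect does not leave a second canvas behind.
function mountCanvas(container: Container): () => void {
  container.children.push("canvas");
  return () => {
    const i = container.children.indexOf("canvas");
    if (i !== -1) container.children.splice(i, 1);
  };
}

// Simulate StrictMode's mount → cleanup → mount cycle.
const root: Container = { children: [] };
let cleanup = mountCanvas(root);
cleanup(); // skipping this call is exactly the bug: two canvases after remount
cleanup = mountCanvas(root);
console.log(root.children.length); // 1, not 2
```

This is only a sketch of the cleanup idiom, not code from the original experiment; in a real three.js component the cleanup would also dispose the renderer and cancel the animation-frame loop.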