Cursor Composer: Building a fast frontier model with RL (cursor.com)

0 points 1 day ago | visit original

🤖 AI Summary

Cursor announced Composer, a new mixture-of-experts agent model optimized for software engineering and interactive speed. On Cursor’s internal benchmarks it delivers “frontier” coding quality while generating output about four times faster than comparable models. Composer is built to keep developers “in the flow” by combining long-context understanding with fast, tool-driven actions. It is already used internally at Cursor for day-to-day development, and it is evaluated on Cursor Bench, a dataset of real agent requests with hand-curated ideal solutions that measures both correctness and adherence to a codebase’s existing abstractions and practices.

Technically, Composer is trained with reinforcement learning in realistic coding environments where it can read and edit files, run terminal commands, and use codebase-wide semantic search. The RL rewards prioritize efficient tool use, parallelism, and concise, well-evidenced responses. During training the model learned, without being explicitly taught, behaviors such as performing complex searches, fixing linter errors, and writing and executing unit tests.

Scaling Composer required custom infrastructure: PyTorch and Ray for asynchronous RL, MXFP8 MoE kernels, expert parallelism combined with hybrid sharded parallelism to run on thousands of GPUs without expensive communication, and hundreds of thousands of sandboxed VMs to simulate agent interactions. The result is a highly specialized, fast agent suited to interactive coding workflows. It trades some top-end accuracy (GPT-5 and Sonnet remain ahead) for substantially lower latency and real-world developer utility.
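To make the reward priorities in the summary (efficient tool use, parallelism, concise responses) more concrete, here is a toy sketch of a shaped reward. The summary does not give Cursor's actual reward function, so everything here is an assumption for illustration only: the `Episode` fields, the `toy_reward` name, and the penalty weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical record of one agent rollout (all fields are illustrative)."""
    tests_passed: bool     # did the final edit pass the task's checks?
    tool_rounds: int       # number of sequential round-trips to tools
    response_tokens: int   # length of the final message to the user


def toy_reward(ep: Episode,
               round_cost: float = 0.02,
               token_cost: float = 0.0001) -> float:
    """Illustrative shaped reward; the weights are made up, not Cursor's."""
    base = 1.0 if ep.tests_passed else 0.0
    # Charging per sequential round (not per individual call) means that
    # batching several tool calls into one parallel round costs less than
    # issuing the same calls one after another.
    efficiency_penalty = round_cost * ep.tool_rounds
    # Longer final answers are penalized slightly to favor concise responses.
    verbosity_penalty = token_cost * ep.response_tokens
    return base - efficiency_penalty - verbosity_penalty


# Same task solved twice: six tool calls issued serially vs. in two batches.
serial = Episode(tests_passed=True, tool_rounds=6, response_tokens=400)
batched = Episode(tests_passed=True, tool_rounds=2, response_tokens=400)
print(toy_reward(serial), toy_reward(batched))
```

Under these assumed weights, the batched rollout scores higher than the serial one even though both solve the task, which is the kind of pressure toward parallel tool use the summary describes. A real system would likely use much richer signals than this.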
