🤖 AI Summary
Inception, a startup led by Stanford professor Stefano Ermon, raised $50 million in seed funding (Menlo Ventures lead; participants include Mayfield, Nvidia’s NVentures, M12, Snowflake Ventures, Databricks, with angels Andrew Ng and Andrej Karpathy) to build diffusion-based models for code and text. The company also released an updated Mercury model aimed at software development and already integrated into tools such as ProxyAI, Buildglare, and Kilo Code. Ermon says the diffusion approach will cut latency and compute costs compared with today’s autoregressive LLMs, positioning Mercury as a production-ready alternative for developer workflows.
Technically, diffusion models generate outputs through iterative, holistic refinement of the whole sequence rather than the token-by-token prediction used by autoregressive models (GPT-style). That structural difference lets each refinement pass update many positions in parallel, yielding much higher throughput and lower latency on large-context tasks—Inception reports benchmark throughput above 1,000 tokens/sec. This could be especially advantageous for operations over large codebases, constrained-data settings, and hardware efficiency. If diffusion architectures scale as claimed, they may challenge the autoregressive orthodoxy for certain text and code applications by offering faster, more compute-efficient inference and different trade-offs for model design and infrastructure.
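The latency argument comes down to sequential depth: an autoregressive decoder must make one dependent model call per token, while a diffusion-style decoder refines every position at once over a small, fixed number of denoising passes. The toy sketch below illustrates only that scaling contrast—the "denoiser" here is a deterministic stand-in (it resolves half of the still-masked positions per pass), not Inception's Mercury model or any real learned denoiser:

```python
def autoregressive_decode(target):
    """Emit one token per step; each step depends on all previous ones."""
    out, steps = [], 0
    for tok in target:
        out.append(tok)   # stand-in for a next-token model call
        steps += 1        # strictly sequential: one step per token
    return out, steps

def diffusion_decode(target, mask="?"):
    """Start fully masked, then refine every position in parallel each pass.

    Toy rule (an assumption for illustration): each denoising pass
    resolves half of the still-masked positions. All updates within a
    pass are independent, so on real hardware they could run in parallel.
    """
    seq, steps = [mask] * len(target), 0
    while mask in seq:
        masked = [i for i, t in enumerate(seq) if t == mask]
        for i in masked[: max(1, len(masked) // 2)]:
            seq[i] = target[i]  # stand-in for one parallel denoising update
        steps += 1
    return seq, steps

target = ["def", "add", "(", "a", ",", "b", ")", ":"] * 16  # 128 tokens
ar_out, ar_steps = autoregressive_decode(target)
diff_out, diff_steps = diffusion_decode(target)
print(ar_steps, diff_steps)  # sequential depth: 128 vs. 8 (~log2 of 128)
```

Both decoders emit the same 128 tokens, but the autoregressive path needs 128 sequential steps while the diffusion path needs 8 passes; that logarithmic-versus-linear sequential depth is the structural basis for the throughput claims above.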