We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6 (blog.kilo.ai)

🤖 AI Summary
The launch of DeepSeek V4 Pro and DeepSeek V4 Flash on April 24, 2026 marks a significant advance in the open-weight AI landscape. Both models feature a new architecture, are released under an MIT license, and are positioned against established models like Claude Opus 4.7 and Kimi K2.6. In a comparative test against a rigorous FlowGraph spec, DeepSeek V4 Pro scored 77/100, outperforming Kimi K2.6 (68) while falling short of Claude Opus 4.7 (91). DeepSeek V4 Flash, priced at just $0.02 per run, scored 60/100 but exhibited several critical output issues, including failing to start workflow runs, which highlights its current limitations despite its affordability.

For the AI/ML community, these results underscore the narrowing gap between open-weight models and proprietary alternatives, particularly in cost-to-quality ratio. DeepSeek V4 Pro is a compelling upgrade over Kimi K2.6 in overall infrastructure and reliability, especially at promotional pricing that makes it competitive on cost. DeepSeek V4 Flash, in contrast, introduces a new budget tier: not yet reliable for complex tasks, but an unprecedented low-cost option for first-pass attempts at backend builds, potentially changing how developers approach testing and iteration in AI-driven applications.