🤖 AI Summary
A lightweight Rust proxy called Claude Code Mux has been released that lets Claude Code clients transparently multiplex requests across many LLM providers, with intelligent routing, streaming, and automatic failover. It exposes the Anthropic/Claude Messages API, so you can point Claude Code at http://127.0.0.1:13456 and the mux will route sub-tasks to the best model (e.g., GLM‑4.6 for web search, Kimi K2 for reasoning, Minimax M2 for general responses) or automatically fall back to secondary providers when a primary is unavailable. The project ships with a modern web UI (auto-save, URL navigation), a live test console, and centralized regex-based config for auto-mapping and background-task detection; most changes require no JSON edits or server restarts.
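To make the routing behavior concrete, here is a minimal Rust sketch of how per-request model selection along those lines could look. The `RequestFeatures` type, its field names, and the model identifiers are illustrative assumptions for this sketch, not the project's actual API.

```rust
// Minimal sketch of per-request model routing (illustrative only; the real
// project's types and model IDs may differ).

/// Features extracted from an incoming Anthropic Messages API request.
struct RequestFeatures {
    uses_web_search_tool: bool, // request declares a web_search tool
    thinking_enabled: bool,     // extended-thinking ("/plan") style request
}

/// Pick a target model from the request's characteristics, mirroring the
/// example policy above: GLM-4.6 for web search, Kimi K2 for reasoning,
/// Minimax M2 for everything else.
fn route_model(features: &RequestFeatures) -> &'static str {
    if features.uses_web_search_tool {
        "glm-4.6"
    } else if features.thinking_enabled {
        "kimi-k2"
    } else {
        "minimax-m2"
    }
}

fn main() {
    let req = RequestFeatures {
        uses_web_search_tool: true,
        thinking_enabled: false,
    };
    println!("routed to {}", route_model(&req)); // prints: routed to glm-4.6
}
```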
Technically, the proxy is written in Rust (1.70+) and claims ~5 MB RAM usage and <1 ms routing overhead. It supports SSE streaming, full Anthropic API compatibility, regex auto-mapping, priority-based model/provider mappings (example: glm-4.6 → primary zai, fallback OpenRouter), and routing triggers for web_search tools, thinking flags (/plan), CCM-SUBAGENT-MODEL tags, and background regex patterns. It supports 16+ providers (Anthropic, OpenAI, OpenRouter, z.ai, Minimax, Kimi, Groq, etc.), enabling cost and reliability optimizations (for example, Minimax M2 at ~$0.30/$1.20 per million input/output tokens and GLM‑4.6 at ~$0.60/$2.20, versus substantially higher rates for Claude Sonnet). This makes production coding-assistant and multi-agent workflows more resilient, cheaper, and easier to operate.
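The priority-based fallback could be modeled roughly as an ordered list of providers tried in sequence until one succeeds. The sketch below uses hypothetical `Provider` and `send` names and a simulated upstream call; it illustrates the failover pattern under those assumptions, not the proxy's actual internals.

```rust
// Sketch of priority-based provider failover (hypothetical types; the real
// proxy's internals are not shown in the summary).

#[derive(Debug)]
struct Provider {
    name: &'static str,
    priority: u8, // lower value = tried first
}

/// Simulated upstream call; a real implementation would issue an HTTP
/// request to the provider and stream the SSE response back to the client.
fn send(provider: &Provider, prompt: &str) -> Result<String, String> {
    if provider.name == "zai" {
        Err(format!("{} unavailable", provider.name)) // force a failover for the demo
    } else {
        Ok(format!("{} answered: {}", provider.name, prompt))
    }
}

/// Try providers in priority order, returning the first successful response.
fn send_with_failover(mut providers: Vec<Provider>, prompt: &str) -> Result<String, String> {
    providers.sort_by_key(|p| p.priority);
    let mut last_err = String::from("no providers configured");
    for p in &providers {
        match send(p, prompt) {
            Ok(resp) => return Ok(resp),
            Err(e) => last_err = e, // remember the failure and try the next provider
        }
    }
    Err(last_err)
}

fn main() {
    // Mirrors the mapping example above: glm-4.6 -> primary zai, fallback OpenRouter.
    let providers = vec![
        Provider { name: "zai", priority: 0 },
        Provider { name: "openrouter", priority: 1 },
    ];
    println!("{:?}", send_with_failover(providers, "hello"));
}
```

The same shape extends naturally to per-model mappings: each model ID keys its own priority-ordered provider list, so a failure on the primary degrades to the fallback without the client noticing.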