Subagents with MCP (cra.mr)

🤖 AI Summary
Sentry experimented with an "Agent Mode" for their Model Context Protocol (MCP) server to eliminate the always-on metadata cost (~14,000 tokens in their basic Sentry MCP). Instead of exposing many composable tools and their heavy metadata directly to third-party agents, they wrap the MCP in an embedded subagent and expose a single use_sentry tool that forwards the user's prompt to that subagent. The implementation uses the TypeScript SDK's in-memory transport (InMemoryTransport.createLinkedPair), experimental_createMCPClient, and a useSentryAgent function that receives the MCP tools and the raw user request. The result: the tool payload dropped to ~720 tokens (a ~95% reduction) while preserving the ability to surface selective Sentry context (URLs, org/project scoping, search and analytics helpers) to coding agents like Claude Code and Cursor.

The tradeoffs are notable. Agent Mode gives much tighter control over tool selection, parameters, and prompting (you can enforce URL handling or parameter parsing in the subagent), and improves reliability versus copy-pasted stack traces. But it reduces tool composability, introduces prompting quirks (e.g., misinterpreting GitHub URLs), and, most seriously, roughly doubled response times in benchmarks (direct mode averaged 11.29s vs. 23.81s in agent mode, ~110% slower), largely due to the extra model hops (GPT-5) and orchestration overhead.

It's a promising pattern for lowering token costs and improving agent behavior, but it needs tuning for latency, prompt and tool-description engineering, and composability before wider adoption.
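The real implementation wires an in-memory MCP client/server pair into an LLM subagent, but the core idea — hiding a large tool surface behind a single dispatch tool so the outer agent only ingests one small tool description — can be sketched in plain TypeScript. The tool registry below and the keyword-based subagent stub are illustrative stand-ins (a real useSentryAgent would be an LLM call over the MCP tools), not Sentry's actual code:

```typescript
// Sketch of the Agent Mode facade pattern: the outer agent sees one
// small `use_sentry` tool; the full tool surface lives inside a subagent.

type Tool = {
  name: string;
  description: string;
  run: (prompt: string) => string;
};

// Full internal tool surface — only the embedded subagent ever sees these,
// so their metadata never lands in the outer agent's context window.
const sentryTools: Tool[] = [
  { name: "find_issues", description: "Search issues within an org/project scope", run: (p) => `issues for: ${p}` },
  { name: "get_event", description: "Fetch a single event by URL or id", run: (p) => `event for: ${p}` },
  { name: "run_analytics", description: "Run an analytics query over event data", run: (p) => `analytics for: ${p}` },
];

// Stand-in subagent: receives the raw user request plus the tools and
// picks one by keyword. In the real pattern this is a model invocation.
function useSentryAgent(prompt: string, tools: Tool[]): string {
  const lower = prompt.toLowerCase();
  const tool = tools.find((t) => lower.includes(t.name.split("_")[1])) ?? tools[0];
  return tool.run(prompt);
}

// The single tool advertised to the outer agent. Its metadata footprint
// is a fraction of serializing every internal tool schema.
const useSentryToolSpec = {
  name: "use_sentry",
  description: "Forward a natural-language Sentry request to an embedded subagent.",
  call: (prompt: string) => useSentryAgent(prompt, sentryTools),
};

// Rough proxy for the token savings: compare the metadata each mode
// forces the outer agent to ingest (character counts, not real tokens).
const directModeChars = JSON.stringify(
  sentryTools.map(({ name, description }) => ({ name, description }))
).length;
const agentModeChars = JSON.stringify({
  name: useSentryToolSpec.name,
  description: useSentryToolSpec.description,
}).length;
console.log(`direct: ${directModeChars} chars, agent: ${agentModeChars} chars`);
```

The character-count comparison at the end is only a crude analogue of the ~14,000 → ~720 token drop the post reports, but it shows where the savings come from: the outer agent pays for one tool description instead of the whole catalog.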