🤖 AI Summary
Speakeasy announced Dynamic Toolsets for MCP (available now in Gram), a hybrid discovery system that cuts token use by orders of magnitude while keeping tools discoverable and agents reliable. Their benchmarks report up to 160x reductions versus static toolsets and average input-token drops of ~96% (total token reductions of 90%+), with 100% task success across experiments spanning 40–400 tools. The practical payoff: you can expose hundreds of API operations to LLM agents without blowing the context window or resorting to heavyweight "code mode" engineering, making MCP viable for production-scale AI tooling.
The technical approach splits tool interaction into three explicit primitives: search_tools (embedding-based semantic search, augmented with brief category overviews and tag filters like source:hubspot), describe_tools (lazy schema loading, so large input schemas are fetched only when needed; schemas often account for 60–80% of static toolset tokens), and execute_tool (invoke the selected tool). This yields predictable costs and scaling at the expense of more LLM calls (2–3x) and roughly 50% higher execution latency in their tests; a typical workflow uses 6–8 tool calls (search → describe → execute). Conversation history acts as a cache that reduces repeat costs. The result is a practical, production-ready MCP pattern that balances discoverability against token efficiency without forcing teams into bespoke runtime architectures.
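To make the flow concrete, here is a minimal TypeScript sketch of the three-primitive pattern. All names, shapes, and the registry contents are illustrative assumptions, not Gram's actual API; in particular, real search_tools uses embedding-based semantic search, for which simple keyword scoring stands in here.

```typescript
// Sketch of the search -> describe -> execute pattern. Everything below is
// a hypothetical illustration, not Speakeasy/Gram's implementation.

type ToolMeta = {
  name: string;
  summary: string;   // short description, always cheap to return
  tags: string[];    // e.g. "source:hubspot", usable as a filter
};

type ToolDef = ToolMeta & {
  inputSchema: object;  // large JSON Schema, loaded lazily
  run: (args: Record<string, unknown>) => Promise<unknown>;
};

const registry: ToolDef[] = [
  {
    name: "hubspot_create_contact",
    summary: "Create a new contact in HubSpot CRM",
    tags: ["source:hubspot", "crm"],
    inputSchema: { type: "object", properties: { email: { type: "string" } } },
    run: async (args) => ({ created: true, ...args }),
  },
  // ...hundreds more operations in a real deployment
];

// 1) search_tools: return only names + summaries, never full schemas.
//    A real implementation ranks by embedding similarity; this sketch
//    scores by keyword overlap and honors tag filters like source:hubspot.
function searchTools(query: string, tagFilter?: string): ToolMeta[] {
  const words = query.toLowerCase().split(/\s+/);
  return registry
    .filter((t) => !tagFilter || t.tags.includes(tagFilter))
    .map((t) => ({
      tool: t,
      score: words.filter((w) => t.summary.toLowerCase().includes(w)).length,
    }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, 5)
    .map(({ tool: { name, summary, tags } }) => ({ name, summary, tags }));
}

// 2) describe_tools: fetch full input schemas only for the shortlist,
//    since schemas dominate static token cost (60-80% per the post).
function describeTools(names: string[]): { name: string; inputSchema: object }[] {
  return registry
    .filter((t) => names.includes(t.name))
    .map(({ name, inputSchema }) => ({ name, inputSchema }));
}

// 3) execute_tool: invoke the chosen tool with the supplied arguments.
async function executeTool(name: string, args: Record<string, unknown>) {
  const tool = registry.find((t) => t.name === name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.run(args);
}

// Typical agent flow: search -> describe -> execute.
async function demo() {
  const hits = searchTools("create a crm contact", "source:hubspot");
  const [schema] = describeTools(hits.map((h) => h.name));
  console.log(schema);
  console.log(await executeTool(hits[0].name, { email: "ada@example.com" }));
}

demo();
```

The key design point the sketch illustrates: only the cheap metadata from step 1 and the one or two schemas from step 2 ever enter the model's context, which is where the reported token savings come from.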