Designing MCP servers for wide schemas and large result sets (axiom.co)

🤖 AI Summary
Axiom described how they redesigned their MCP server to keep contexts compact for AI assistants that query very wide, high-volume logs. The main techniques:

- Switch tabular payloads from verbose JSON to compact CSV: a 5-row example dropped from 235 tokens (JSON) to 166 tokens (CSV), a ~29% saving (~14 tokens per row) that scales linearly with row count (sketch below).
- Impose a global cell budget that charges totals and summaries first, distributes the remaining cells evenly across data tables, and annotates trimmed results (e.g., “Showing 100 of 2,340 rows...”) so models know they’re seeing a slice (sketch below).
- Use an intelligent column-selection heuristic to surface the most informative N columns when schemas have thousands of fields, prioritizing common observability fields, high fill-rate and high-cardinality columns, short names, and aggregations (sketch below).
- Push limits to the source to avoid wasted upstream work: LIMIT clauses, capped automatic binning, and a new maxBinAutoGroups parameter to bound histogram buckets; testing found ~15 buckets are often sufficient (sketch below).
- Keep defaults intentionally lean but configurable via URL flags (e.g., ?max_cells=10000, ?tools=core,otel) (sketch below).
- Accept the trade-offs: lost typing and nesting in CSV, the occasional extra fetch when budgets slice data, and reduced resolution from bin caps, in exchange for predictable latency, token-cost control, and better multi-turn agent reasoning.
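A minimal sketch of the JSON-to-CSV swap in TypeScript (assuming flat rows; the helper name and row shape are illustrative, not Axiom's code). Headers are emitted once instead of repeating every key on every row, which is where the per-row token savings come from:

```typescript
// Render tabular results as CSV instead of per-row JSON objects.
type Row = Record<string, string | number | boolean | null>;

function toCsv(rows: Row[]): string {
  if (rows.length === 0) return "";
  const headers = Object.keys(rows[0]);
  const escape = (v: unknown): string => {
    const s = v === null || v === undefined ? "" : String(v);
    // Quote fields containing commas, quotes, or newlines.
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [headers.join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => escape(row[h])).join(","));
  }
  return lines.join("\n");
}
```

The trade-off noted in the summary applies here: CSV drops JSON's typing and nesting, so nested values would need flattening or stringifying before this step.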
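One way the global cell budget could work, as a hedged sketch: the Table shape, the maxCells default, and the even-split policy below are assumptions based on the summary, not Axiom's implementation.

```typescript
// Charge summary cells first, split the remainder evenly across tables,
// and annotate any table that had to be trimmed.
interface Table { name: string; columns: number; rows: number; }

function allocateRows(tables: Table[], summaryCells: number, maxCells = 10_000) {
  const remaining = Math.max(0, maxCells - summaryCells);
  const perTable = Math.floor(remaining / Math.max(1, tables.length));
  return tables.map((t) => {
    const rowBudget = Math.floor(perTable / Math.max(1, t.columns));
    const shown = Math.min(t.rows, rowBudget);
    const note = shown < t.rows ? `Showing ${shown} of ${t.rows} rows...` : undefined;
    return { name: t.name, shown, note };
  });
}
```

The annotation is what lets a model decide whether it needs a follow-up fetch for the rows that were sliced off.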
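A sketch of the column-selection heuristic; the weights, field list, and score shape are illustrative guesses at the stated criteria (common fields, fill rate, cardinality, name length, aggregations), not Axiom's actual scoring:

```typescript
interface ColumnStats {
  name: string;
  fillRate: number;     // fraction of rows with a non-null value, 0..1
  cardinality: number;  // distinct values observed
  isAggregation: boolean;
}

// Hypothetical allowlist of common observability fields.
const COMMON_FIELDS = new Set(["_time", "level", "service", "trace_id", "message"]);

function scoreColumn(c: ColumnStats): number {
  let score = 0;
  if (COMMON_FIELDS.has(c.name)) score += 10;          // common observability fields first
  if (c.isAggregation) score += 5;                     // aggregations are information-dense
  score += c.fillRate * 5;                             // prefer well-populated columns
  score += Math.min(Math.log2(1 + c.cardinality), 5);  // reward cardinality, capped
  score -= c.name.length / 20;                         // slight penalty for long names
  return score;
}

// Keep the N best-scoring columns out of a potentially huge schema.
function selectColumns(cols: ColumnStats[], n: number): ColumnStats[] {
  return [...cols].sort((a, b) => scoreColumn(b) - scoreColumn(a)).slice(0, n);
}
```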
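For the bin cap, a tiny sketch of the arithmetic: the parameter name maxBinAutoGroups comes from the article, but the width computation here is an assumption.

```typescript
// Cap automatic time-binning so a histogram never exceeds the group limit.
function binWidthMs(rangeMs: number, maxBinAutoGroups = 15): number {
  return Math.ceil(rangeMs / maxBinAutoGroups);
}

// A 1-hour range capped at 15 groups yields 4-minute buckets:
// binWidthMs(3_600_000) === 240_000
```

Coarser buckets mean lost resolution, which the summary counts as an acceptable trade for bounded output size.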
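Reading those flags could look like the following sketch; only the max_cells and tools flag names come from the article, while the hostname and defaults are placeholders:

```typescript
// Parse lean-but-overridable defaults from the server URL's query string.
function parseConfig(url: string) {
  const params = new URL(url).searchParams;
  const maxCells = Number(params.get("max_cells") ?? 10_000); // assumed default
  const tools = (params.get("tools") ?? "core").split(",").map((t) => t.trim());
  return { maxCells, tools };
}

// parseConfig("https://mcp.example.com/?max_cells=10000&tools=core,otel")
//   -> { maxCells: 10000, tools: ["core", "otel"] }
```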