🤖 AI Summary
An inspection of Claude Code's runtime reveals two distinct built-in web tools with complementary roles: WebFetch and WebSearch. WebFetch takes a required URL (capped at roughly 2,000 characters) plus a user prompt and returns a concise answer rather than raw HTML. Its pipeline validates and normalizes the URL, consults a domain_info endpoint to enforce a deny-list, follows same-host redirects (cross-host redirects return only metadata), caches results for about 15 minutes, and caps fetch size at roughly 10 MB, with further truncation downstream. Fetched HTML is converted to Markdown (Turndown) and truncated to about 100 KB of text before a small, fast model (Haiku 3.5) summarizes and answers under strict rules (an empty system prompt, verbatim quotes capped at 125 characters, a ban on lyrics and legal advice) to reduce prompt-injection and copyright risk. WebSearch, by contrast, accepts a query plus optional allow/block domain lists and returns lightweight results (title and URL only). Although the underlying search API includes page_age and encrypted_content fields, Claude Code extracts only titles and URLs; fetching the actual content requires a separate WebFetch call. The server-side WebSearch is available on Anthropic's API but omitted when using platforms like Bedrock/Vertex.
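A minimal TypeScript sketch of that pipeline, assuming hypothetical helper names (`fetchAndSummarize`, `isDeniedDomain`, `summarizeWithHaiku` are illustrative stand-ins, not Claude Code's actual identifiers; only the caps, cache TTL, and redirect behavior come from the summary above):

```typescript
import TurndownService from "turndown";

// Illustrative stubs: the real deny-list endpoint and Haiku call are internal.
declare function isDeniedDomain(host: string): Promise<boolean>;
declare function summarizeWithHaiku(text: string, prompt: string): Promise<string>;

const MAX_URL_LENGTH = 2000;               // ~2,000-character URL cap
const MAX_FETCH_CHARS = 10 * 1024 * 1024;  // ~10 MB fetch limit (approximated in chars)
const MAX_TEXT_CHARS = 100 * 1024;         // ~100 KB text cap before summarization
const CACHE_TTL_MS = 15 * 60 * 1000;       // ~15-minute cache

const cache = new Map<string, { expires: number; answer: string }>();

async function fetchAndSummarize(rawUrl: string, prompt: string): Promise<string> {
  if (rawUrl.length > MAX_URL_LENGTH) throw new Error("URL too long");
  const url = new URL(rawUrl); // validates and normalizes the URL

  if (await isDeniedDomain(url.hostname)) throw new Error("Domain is deny-listed");

  const cached = cache.get(url.href);
  if (cached && cached.expires > Date.now()) return cached.answer;

  // Handle redirects manually so cross-host hops can be reported as metadata.
  const res = await fetch(url.href, { redirect: "manual" });
  if (res.status >= 300 && res.status < 400) {
    const target = new URL(res.headers.get("location") ?? "", url);
    if (target.hostname !== url.hostname) {
      return `Redirects to ${target.href}`; // cross-host: metadata only
    }
    return fetchAndSummarize(target.href, prompt); // same-host: follow
  }

  const html = (await res.text()).slice(0, MAX_FETCH_CHARS);
  const markdown = new TurndownService().turndown(html); // HTML -> Markdown
  const text = markdown.slice(0, MAX_TEXT_CHARS);

  // Small, fast model answers the prompt over the truncated Markdown,
  // then the result is cached for subsequent calls within the TTL.
  const answer = await summarizeWithHaiku(text, prompt);
  cache.set(url.href, { expires: Date.now() + CACHE_TTL_MS, answer });
  return answer;
}
```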
For agent builders and the AI/ML community this is a pragmatic architecture: separate lightweight discovery (WebSearch) from controlled content ingestion (WebFetch) to minimize cost, token use, and attack surface. Key implications include explicit schemas for tool calls, conservative quoting and IP policies enforced early via Haiku, predictable caching and redirect semantics for reproducibility, and a dependence on Anthropic's server-side search availability. All of these are useful design patterns for safe, cost-aware web-enabled agents.
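A hypothetical discover-then-fetch loop built on this two-tool split might look like the following; the types mirror the schemas described above, while `webSearch` and `fetchAndSummarize` are illustrative stand-ins rather than Claude Code's actual API:

```typescript
interface SearchResult {
  title: string;
  url: string; // page_age and encrypted_content exist upstream but are dropped
}

// Illustrative stubs for the two tools discussed above.
declare function webSearch(
  query: string,
  opts?: { allowedDomains?: string[]; blockedDomains?: string[] }
): Promise<SearchResult[]>;
declare function fetchAndSummarize(url: string, prompt: string): Promise<string>;

async function research(query: string, question: string): Promise<string[]> {
  // Cheap discovery: titles and URLs only, with no page content ingested yet.
  const results = await webSearch(query);

  // Controlled ingestion: each page goes through the full WebFetch pipeline
  // (deny-list, cache, truncation, Haiku summarization), one URL at a time.
  const answers: string[] = [];
  for (const r of results.slice(0, 3)) {
    answers.push(await fetchAndSummarize(r.url, question));
  }
  return answers;
}
```

Keeping discovery and ingestion as separate calls means the agent only pays the token and security cost of full page content for URLs it explicitly chooses to fetch.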