Hacking OpenAI's Internet Search (www.onyx.app)

🤖 AI Summary
Researcher Yuhong Sun reverse-engineered how major chat systems (OpenAI, Anthropic, Grok, Gemini) implement internet-enabled LLM search and found a surprisingly uniform pattern: the models are given two discrete "actions", web.search (returns ranked links plus snippets and metadata) and web.open_url (fetches the full page), and the LLM itself decides which queries to run and which pages to open, often in parallel. By probing with crafted prompts, the author could reliably extract tool names and behavior; the models would even correct near-miss guesses like "follow_link" to the real "open_url". This shows that prompt-hacking can expose internal tooling and workflows. That repeatability, together with evidence of parallel query consolidation, matters because it reveals a common architectural motif across providers and an attack surface for researchers and attackers alike.

Technically, the article explains how a comparable production setup is built: use a web search API (Google, Serper, Exa) to get snippets, then a scraper (in-house or a service like Firecrawl) to retrieve full text, and apply reasoning models or chain-of-thought layers to handle large page context and reduce hallucinations. The practical takeaway for the AI/ML community is twofold: you can reproduce efficient, human-like search behavior by exposing selective tool APIs to LLMs, but doing so requires careful safety review, prompt-hardening, and pipeline design (parallel querying, reliable scraping, and reasoning layers) to avoid information leakage and ensure factual responses.
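The two-action pattern is straightforward to reproduce with any tool-calling chat model. The sketch below is an illustration under assumptions, not the providers' internal implementation: it uses the OpenAI Python SDK for the agent loop, `web_search` and `open_url` as underscore stand-ins for the article's web.search / web.open_url names, Serper as the search backend, and a stubbed scraper (to be wired to an in-house scraper or a service such as Firecrawl). The model name is likewise a placeholder.

```python
"""Minimal sketch of the search/open-url tool loop described above.
Assumptions (not from the article): OpenAI Python SDK for the loop,
Serper as the search backend, stubbed scraper for full-page text."""
import json
import os

import requests
from openai import OpenAI

client = OpenAI()

# Two tools mirroring the web.search / web.open_url pattern.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web; returns ranked links with titles and snippets.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "open_url",
            "description": "Fetch the full text of a page found via web_search.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]


def web_search(query: str) -> list[dict]:
    # Assumed Serper endpoint and response shape; Google CSE or Exa work similarly.
    resp = requests.post(
        "https://google.serper.dev/search",
        headers={"X-API-KEY": os.environ["SERPER_API_KEY"]},
        json={"q": query},
        timeout=15,
    )
    resp.raise_for_status()
    return [
        {"title": r.get("title"), "url": r.get("link"), "snippet": r.get("snippet")}
        for r in resp.json().get("organic", [])
    ]


def open_url(url: str) -> str:
    # Stub: plug in your own scraper or a service such as Firecrawl here.
    raise NotImplementedError("wire this to a scraper that returns cleaned page text")


def answer(question: str, max_steps: int = 6) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        resp = client.chat.completions.create(
            model="gpt-4o",  # assumption: any tool-calling chat model works
            messages=messages,
            tools=TOOLS,
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content  # the model decided it has enough context to answer
        messages.append(msg)
        # The model may request several searches/opens in a single turn;
        # answer each under its tool_call_id so it can consolidate results.
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            if call.function.name == "web_search":
                result = json.dumps(web_search(args["query"]))
            else:
                result = open_url(args["url"])
            messages.append(
                {"role": "tool", "tool_call_id": call.id, "content": result}
            )
    return "Step budget exhausted without a final answer."
```

Because the model can emit several tool calls in one turn, this loop naturally reproduces the parallel-query behavior the article observes: each search or page fetch is returned under its own tool_call_id, and the model consolidates the results before drafting its answer.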