AI Mode can now help you search and explore visually (blog.google)

🤖 AI Summary
Google announced a major upgrade to AI Mode in Search that makes visual exploration conversational and multimodal: you can type or speak a vague idea, upload or snap a photo, and get a curated set of images you can iteratively refine (e.g., “more dark tones and bold prints” or “more ankle length”). The experience surfaces shoppable visual results tied to retailer links, powered by Google’s Shopping Graph (over 50 billion product listings, refreshed at a rate of roughly 2 billion listings per hour), so users can click through and buy directly. The feature is rolling out in English in the U.S. this week.

Technically, the update combines Google Lens and Image Search with Gemini 2.5’s multimodal capabilities and a new “visual search fan-out” technique: the system detects primary and secondary objects in an image, spawns multiple background queries, and fuses those signals to better match nuanced natural-language prompts. That multi-query, multimodal retrieval pipeline is significant for the AI/ML community because it demonstrates large-scale, real-time fusion of visual understanding and semantic search, improving context-aware retrieval, e-commerce relevance, and interactive image QA, while also highlighting engineering challenges in indexing, latency, and multimodal evaluation at web scale.
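Google has not published implementation details for the fan-out technique, but the described flow (detect objects, spawn background queries, fuse signals) can be sketched in a few lines. The sketch below is purely illustrative: `detect_objects` and `search_index` are hypothetical stand-ins for a vision model and a retrieval backend, not Google APIs.

```python
"""Hypothetical sketch of a visual search fan-out pipeline."""
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def detect_objects(image_path: str) -> list[dict]:
    # Stand-in for a vision model returning primary/secondary objects.
    return [
        {"label": "midi dress", "role": "primary", "score": 0.92},
        {"label": "bold floral print", "role": "secondary", "score": 0.81},
        {"label": "ankle boots", "role": "secondary", "score": 0.74},
    ]


def search_index(query: str) -> list[tuple[str, float]]:
    # Stand-in for a retrieval call; returns (doc_id, relevance) pairs.
    return [(f"{query}::result{i}", 1.0 / (i + 1)) for i in range(3)]


def fan_out(image_path: str, user_prompt: str) -> list[tuple[str, float]]:
    objects = detect_objects(image_path)

    # One sub-query per detected object, each combined with the user's
    # natural-language refinement (e.g., "more dark tones and bold prints").
    queries = [f'{obj["label"]} {user_prompt}' for obj in objects]
    weights = {q: obj["score"] for q, obj in zip(queries, objects)}

    # Issue the sub-queries concurrently, mirroring "background queries".
    with ThreadPoolExecutor() as pool:
        results_per_query = list(pool.map(search_index, queries))

    # Fuse: weight each hit by its source object's detection score and
    # accumulate scores for documents surfaced by multiple sub-queries.
    fused: dict[str, float] = defaultdict(float)
    for query, results in zip(queries, results_per_query):
        for doc_id, relevance in results:
            fused[doc_id] += weights[query] * relevance

    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for doc_id, score in fan_out("outfit.jpg", "more dark tones")[:5]:
        print(f"{score:.2f}  {doc_id}")
```

The key design point is the fusion step: because several sub-queries can surface the same listing, summing detection-weighted relevance scores lets results that match both the primary object and the user's refinement rise to the top.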