mgrep: searching codebases with embeddings (github.com)

🤖 AI Summary
Mixedbread has released mgrep, a CLI-first semantic search tool that brings natural-language, multimodal search to codebases, docs, images, and PDFs, with audio and video support planned. It installs via npm and runs like classic grep: mgrep watch indexes a repo in the background (respecting .gitignore and .mgrepignore), mgrep search "query" [path] runs a natural-language query, and authentication is handled by mgrep login or by setting MXBAI_API_KEY for headless use. Results come with skim-friendly context such as line ranges and page numbers. Indexed content lives in cloud-backed "stores" that can be isolated per repo or team, so agents and teammates can query the same corpus without re-uploading it. Under the hood, mgrep is powered by Mixedbread Search, which combines state-of-the-art semantic retrieval models with context-aware parsing, top-k retrieval with reranking, and optimized inference. It is built to complement traditional grep, not replace it: use grep for exact matches and mgrep for intent-driven discovery, onboarding, and feature exploration. Integrations include a Claude Code plugin, with more agent integrations planned. In a 50-task benchmark, mgrep with Claude Code used roughly half as many tokens as grep-based workflows while matching or improving quality, because semantic retrieval surfaces the relevant snippets and the model spends its capacity on reasoning instead of scanning. For engineers and ML teams building retrieval-augmented agents, mgrep reduces token costs, streamlines repo indexing, and supplies richer contextual hits for downstream reasoning and code agents.
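For anyone wiring the CLI into an agent or script, here is a minimal Python sketch of calling the search command described above. It uses only the command form given in the summary (mgrep search "query" [path]) and the MXBAI_API_KEY variable. The helper names build_search_cmd, headless_env, and run_search are hypothetical, and the sketch assumes that mgrep is on PATH and writes its results to stdout, neither of which the summary confirms.

```python
from __future__ import annotations

import os
import shutil
import subprocess
from typing import Optional


def build_search_cmd(query: str, path: Optional[str] = None) -> list[str]:
    """Build the argv for `mgrep search "query" [path]` (form taken from the summary)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    cmd = ["mgrep", "search", query]
    if path is not None:
        cmd.append(path)
    return cmd


def headless_env(api_key: Optional[str] = None) -> dict[str, str]:
    """Copy the current environment, optionally setting MXBAI_API_KEY for headless auth."""
    env = dict(os.environ)
    if api_key is not None:
        env["MXBAI_API_KEY"] = api_key
    return env


def run_search(
    query: str,
    path: Optional[str] = None,
    api_key: Optional[str] = None,
) -> Optional[str]:
    """Run the search if `mgrep` is installed; return its stdout, or None if it is missing.

    Assumption: results are printed to stdout. The output format is not parsed here.
    """
    if shutil.which("mgrep") is None:
        return None
    result = subprocess.run(
        build_search_cmd(query, path),
        env=headless_env(api_key),
        capture_output=True,
        text=True,
        check=False,  # leave exit-code handling to the caller
    )
    return result.stdout
```

Passing the query as a single list element avoids shell quoting problems with natural-language queries that contain spaces or quotes, which is why the sketch builds an argv list instead of a command string.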