Am I missing the boat on vector databases for RAG? (littleleaps.substack.com)

🤖 AI Summary
A thoughtful critique argues that dense vector databases aren't always the right default for text-heavy RAG applications: traditional full-text search (FTS) over sparse representations and inverted indexes (e.g., BM25) can be cheaper, more stable, and sometimes more effective. The author cites DeepMind's work on the theoretical limits of embedding-based retrieval and the variability of the MTEB leaderboard to argue that dense embeddings struggle across tasks and domains without task-specific retraining. In practice, many modern "vector DBs" are hybrids in which sparse inverted indexes are essentially bolted onto HNSW-style ANN engines optimized for dense vectors, rather than unified solutions.

The operational implications matter too: dense embeddings require re-indexing the corpus whenever the embedding model changes, carry higher compute and storage costs (the author estimates ~700M vectors at ~$45K/month on Milvus vs. ~$8.2K/month on Elastic), and offer less configurability than mature search DSLs. For many RAG flows, an LLM can handle query expansion or synonym generation on the sparse side, and reference implementations (OpenAI's and Claude's) often fall back to text-based grep/regex search.

Dense vector retrieval remains crucial for unlabelled image/video similarity and personalized recommenders, but for purely textual corpora the piece urges careful evaluation of FTS-first or hybrid approaches rather than assuming a dense vector DB is always necessary.
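To make the FTS-first/hybrid idea concrete, here is a minimal sketch (not from the article) that ranks a toy corpus with BM25 using the `rank_bm25` package and fuses that ranking with a dense one via reciprocal rank fusion; the `dense_rank` list is a hypothetical stand-in for results an HNSW-style ANN index would return.

```python
# pip install rank_bm25
from rank_bm25 import BM25Okapi

docs = [
    "Inverted indexes power traditional full-text search engines.",
    "HNSW builds a graph for approximate nearest-neighbor search over dense vectors.",
    "BM25 scores documents by term frequency and inverse document frequency.",
    "Hybrid retrieval fuses sparse and dense rankings for RAG pipelines.",
]

# Sparse side: BM25 over a whitespace-tokenized corpus
# (real systems use proper analyzers, stemming, stopword handling, etc.).
tokenized = [d.lower().split() for d in docs]
bm25 = BM25Okapi(tokenized)
query = "sparse full-text search with bm25".lower().split()
scores = bm25.get_scores(query)
bm25_rank = sorted(range(len(docs)), key=lambda i: -scores[i])

# Dense side: hypothetical ANN result order standing in for an HNSW index.
dense_rank = [3, 1, 2, 0]

def rrf(rankings, k=60):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d))."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)

for doc_id in rrf([bm25_rank, dense_rank]):
    print(docs[doc_id])
```

Reciprocal rank fusion is one common way to combine sparse and dense rankings because it needs no score normalization across the two systems; the constant k=60 is a conventional default, not something the article specifies.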