Liberating Search from the Search Engine (softwaredoug.com)

🤖 AI Summary
A thoughtful call-to-arms for moving boosting, reranking, and most business logic out of heavyweight search engines and into a client-side search API: instead of relying on complex, engine-specific query DSLs, fetch a top-N candidate set (e.g., 500–1,000 documents) from Elasticsearch/Vespa/Weaviate and perform token-aware reranking, BM25 rescoring, ML model ranking, and paging in your application layer.

The author argues this yields predictable first-stage (L0) retrieval load, less vendor lock-in, access to ubiquitous Python ML tooling for ranking, clearer business logic, and a straightforward fallback to the raw top-N when the reranking pipeline fails. Technically it requires tooling to stream candidates into a top-N buffer, retain token/term-vector views (or ship term vectors to the client), compute per-field BM25 or other lexical scores, and track global statistics such as document frequency (in Redis or similar) for TF-IDF/BM25 approximations; rough sketches of each piece follow below.

Paging becomes a stateful cache problem (ZADD/ZPOPMAX or other heaps), and the client library would need to manage TTLs and determinism for pagination. The trade-offs are reimplementing paging and aggregations and accepting some document-frequency inaccuracy versus a sharded engine's exact counts, but the payoff is simpler, predictable retrieval queries and full control over ML reranking and business logic, making this approach a practical best practice rather than a "dirty hack."
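A minimal sketch of the fetch-then-rerank loop described above, assuming an elasticsearch-py client; the index name, field names, and scoring callable are hypothetical, not from the article. The L0 query stays a flat, cheap match with no boosts or scripts, and candidates stream into a bounded top-k buffer scored entirely in the application layer:

```python
import heapq
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

def fetch_candidates(query_text, index="products", n=1000):
    """Cheap, predictable L0 query: a plain match, no engine-side boosting.
    Index and field names are illustrative."""
    resp = es.search(
        index=index,
        query={"match": {"title": query_text}},
        size=n,
        source=["title", "description"],
    )
    for hit in resp["hits"]["hits"]:
        yield hit["_id"], hit["_score"], hit["_source"]

def rerank_top_k(candidates, score_fn, k=25):
    """Stream candidates into a bounded top-k buffer, rescored client-side.
    score_fn is any callable, e.g. an ML model scoring one document."""
    heap = []  # min-heap of (new_score, doc_id, doc); smallest score at heap[0]
    for doc_id, l0_score, doc in candidates:
        item = (score_fn(doc, l0_score), doc_id, doc)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)  # evict the current worst
    return sorted(heap, reverse=True)
```

Anything from a hand-written boost function to a cross-encoder or gradient-boosted LTR model can slot in as `score_fn`; the fallback the summary mentions is simply returning the candidates in raw L0 order when that callable fails.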
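The per-field BM25 approximation with globally tracked document frequency could look like the sketch below. Term frequencies come from term vectors shipped with each hit; document frequencies and the corpus size live in a Redis hash and counter. The key names (`df:{field}`, `ndocs:{field}`), the BM25 variant, and the parameter defaults are all assumptions, which is why scores are approximate versus the engine's per-shard statistics:

```python
import math
import redis

r = redis.Redis()  # assumed local Redis

def index_doc_tokens(field, tokens):
    """Maintain global stats at index time: one DF increment per unique
    term per document, plus a corpus document counter."""
    for term in set(tokens):
        r.hincrby(f"df:{field}", term, 1)
    r.incr(f"ndocs:{field}")

def bm25(query_terms, doc_tf, doc_len, avg_len, field="title", k1=1.2, b=0.75):
    """Approximate BM25 for one field of one document. doc_tf maps
    term -> term frequency, taken from the engine's term vectors."""
    n_docs = int(r.get(f"ndocs:{field}") or 1)
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf == 0:
            continue
        df = int(r.hget(f"df:{field}", term) or 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * doc_len / avg_len))
    return score
```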
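And paging as a stateful cache, per the ZADD/ZPOPMAX mention: the reranked result set is cached in a Redis sorted set under a per-query key with a TTL, and each page is destructively popped off the top. The key scheme and TTL below are illustrative, not from the article:

```python
import redis

r = redis.Redis()  # assumed local Redis

def cache_results(query_key, scored_docs, ttl_seconds=300):
    """scored_docs: list of (doc_id, score) after client-side reranking.
    The sorted set acts as the pagination heap; the TTL bounds staleness."""
    key = f"results:{query_key}"
    r.zadd(key, {doc_id: score for doc_id, score in scored_docs})
    r.expire(key, ttl_seconds)

def next_page(query_key, page_size=10):
    """Pop the next page of highest-scoring docs. Redis orders equal
    scores lexicographically by member, so pagination is deterministic
    across calls against the same cached set."""
    return r.zpopmax(f"results:{query_key}", page_size)
```

Repeated requests within the TTL page through the cached set without re-running retrieval or reranking; once the key expires, the pipeline falls back to a fresh fetch, which is the determinism-vs-freshness trade the client library would have to manage.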