Knowledge-RAG – Local RAG for Claude Code with hybrid search and cross-encoder (github.com)

0 points | 98 days ago

🤖 AI Summary

Knowledge-RAG has released a major update to its local Retrieval-Augmented Generation (RAG) system for Claude Code, which provides document search and retrieval without cloud dependencies. The new version installs with a single Python command and no longer requires Docker or other complex setup. All processing happens locally, so sensitive data stays on the user's machine. The main technical change is the move from an external server (Ollama) to an in-process embedding engine (FastEmbed), which uses a smaller 384-dimensional model that improves speed and removes the overhead of managing a server. Knowledge-RAG now uses hybrid search that combines semantic embeddings with BM25 keyword matching, and a cross-encoder reranking step refines the results for better precision on complex queries. The update also adds document chunking based on Markdown headers and query expansion through synonym mapping, and it supports various document formats, which makes it suited to professionals who need reliable, private access to internal documentation.
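
To make the pipeline in the summary more concrete, here is a minimal sketch of three of the stages it names: chunking by Markdown headers, query expansion through a synonym map, and merging a semantic ranking with a keyword ranking. It is not taken from the Knowledge-RAG code. The function names, the `SYNONYMS` table, and the use of reciprocal rank fusion to combine the two rankings are all assumptions made for illustration; the summary does not say how the project fuses its results. The embedding, BM25, and cross-encoder stages are left out because they depend on external models.

```python
import re
from collections import defaultdict

# Hypothetical synonym table; the project's real mapping is not given in the summary.
SYNONYMS = {"auth": ["authentication", "login"]}


def chunk_by_headers(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per header section.

    Simplification: a line starting with '#' inside a fenced code block
    would also be treated as a header.
    """
    chunks, title, body = [], "(preamble)", []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous section if it has any non-blank text.
            if any(s.strip() for s in body):
                chunks.append({"title": title, "text": "\n".join(body).strip()})
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    if any(s.strip() for s in body):
        chunks.append({"title": title, "text": "\n".join(body).strip()})
    return chunks


def expand_query(query: str, synonyms: dict = SYNONYMS) -> list[str]:
    """Return the lowercased query terms followed by any mapped synonyms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in synonyms.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document ids; each list adds 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    doc = "intro\n# Setup\nrun it\n## Config\nset x\n"
    print([c["title"] for c in chunk_by_headers(doc)])
    print(expand_query("Auth error"))
    semantic_hits = ["a", "b", "c"]  # e.g. ids ordered by embedding similarity
    keyword_hits = ["b", "d", "a"]   # e.g. ids ordered by BM25 score
    print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

In a full pipeline of the kind the summary describes, the fused list would then be passed to a cross-encoder, which scores each query and chunk pair jointly and reorders the top candidates.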
