The low-cost path to AI Mastery (antonyarkov.substack.com)

🤖 AI Summary
A developer demonstrates a low-cost, hands-on path to RAG-style assistants with “Wiki Navigator,” a searchable chatbot built over 9,000 Chromium docs that runs without GPUs and can even be deployed statically in a browser. The project shows that you can learn and build practical AI tools by focusing on three core building blocks (tokenization, vector embeddings, and cosine similarity) rather than on large models or heavy infrastructure. Within hours, the author built a retriever that returns citation-based answers, which keeps responses grounded in the source documents and guards against hallucinations, and then reused the same code to build a Rust-book assistant, showing that the approach is accessible for both learning and internal doc search. Technically, the pipeline is simple. A training phase converts documents into embeddings (the options shown include a hash-based SimpleEmbeddingService, TF‑IDF, or ONNX all‑MiniLM) and produces an index; the query phase encodes the user's question and compares it against the indexed vectors using cosine similarity to pick the Top-K matches. The system enforces consistency by mirroring tokenization and vector math between C# (training) and JavaScript (client runtime), applies a default FAQ similarity threshold (~0.90) with a RAG fallback when confidence is low, and runs searches in memory in real time. The writeup also flags limits: hash embeddings are simple, transformer embeddings are stronger, and cosine similarity can be gamed. It highlights these trade-offs and invites readers to explore the source code and incremental upgrades (ONNX, advanced vector search, Rust tooling).
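The pipeline summarized above (embed documents, embed the query, rank by cosine similarity, fall back to retrieval when the FAQ match is weak) can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function names (`tokenize`, `embed`, `top_k`, `answer`), the 256-dimension vector width, the MD5-based bucket hashing, and the sample documents are all assumptions made for the example. Only the ~0.90 FAQ threshold and the Top-K cosine ranking come from the summary.

```python
import hashlib
import math
import re

DIM = 256  # assumed vector width; the summary does not state one


def tokenize(text):
    """Lowercase and split on non-alphanumerics (an assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text, dim=DIM):
    """Hashing-trick embedding: each token bumps one bucket, then L2-normalize."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query, index, k=3):
    """Return the k highest-scoring (score, doc_id) pairs for the query."""
    q = embed(query)
    scored = [(cosine(q, vec), doc_id) for doc_id, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def answer(query, faq_index, doc_index, threshold=0.90, k=3):
    """Use the best FAQ entry if it clears the threshold, else fall back to Top-K docs."""
    best = top_k(query, faq_index, k=1)
    if best and best[0][0] >= threshold:
        return ("faq", best)
    return ("rag", top_k(query, doc_index, k))


# "Training" phase: build small in-memory indexes (hypothetical sample data).
faq_index = [("faq-1", embed("What is the sandbox?"))]
doc_index = [
    ("sandbox.md", embed("The GPU sandbox design")),
    ("build.md", embed("How to build on Linux")),
]

# Query phase: an exact FAQ match takes the FAQ path; anything else falls back.
mode_hit, _ = answer("What is the sandbox?", faq_index, doc_index)
mode_miss, ranked = answer("gpu sandbox design", faq_index, doc_index)
```

Because the client and the indexer must produce identical vectors, any port of this sketch to another runtime (the summary mentions C# for training and JavaScript in the browser) would need the same tokenizer, the same hash function, and the same normalization on both sides.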