🤖 AI Summary
Voyage launched contextualized chunk embeddings (voyage-context-3), a purpose-built embedding model that encodes a document's chunks together so each chunk's vector reflects document-level context. The model supports a context window of up to 32,000 tokens and output dimensions of 2048, 1024 (default), 512, or 256. It's available via the Voyage Python package (voyageai.Client.contextualized_embed()) and a POST endpoint (/v1/contextualizedembeddings), with TypeScript integration as well. The API accepts inputs as a List[List[str]], where each inner list typically holds the ordered chunks of a single document (or a single-element list for a context-agnostic query); it recommends no chunk overlap and enforces limits of ≤1,000 inputs, ≤120k total tokens, and ≤16k chunks.
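A minimal sketch of the document-side call described above, assuming the voyageai package is installed and VOYAGE_API_KEY is set; the sample chunks are invented, and the response attribute names (results, embeddings) follow the summary rather than the client's exact field names:

```python
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

# Each inner list holds the ordered chunks of one document,
# with no overlap between chunks (per the recommendation above).
document_chunks = [
    [
        "Voyage AI builds embedding models for retrieval.",
        "voyage-context-3 encodes chunks jointly so each vector "
        "reflects document-level context.",
    ]
]

result = vo.contextualized_embed(
    inputs=document_chunks,
    model="voyage-context-3",
    input_type="document",  # prepends the document-side retrieval prompt
)

# One result per input document, with embeddings in chunk order.
for doc in result.results:
    print(len(doc.embeddings), "chunk vectors of dim", len(doc.embeddings[0]))
```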
This matters for RAG and retrieval systems because contextualized chunk embeddings mitigate the information loss caused by embedding chunks independently, improving retrieval relevance and enabling parent-document retrieval while preserving chunk order and provenance. Practical options include input_type (None/query/document), which prepends retrieval prompts; a custom chunk_fn (defaulting to a LangChain recursive splitter); and multiple output_dtype choices (float, int8/uint8, binary/ubinary) for precision vs. storage/latency tradeoffs. Responses return ordered embeddings, chunk_texts (if chunking was applied), and metadata such as total_tokens, making voyage-context-3 a drop-in, production-ready alternative to standard embeddings for long-document, multilingual retrieval.
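A companion sketch of the query side and the precision/dimension knobs mentioned above; the output_dimension parameter name is assumed to mirror Voyage's standard embeddings API, and the response fields (results, embeddings, total_tokens) are again taken from the summary rather than verified against the client:

```python
import voyageai

vo = voyageai.Client()

# A query is passed as a single-element inner list (context-agnostic).
query_resp = vo.contextualized_embed(
    inputs=[["How does voyage-context-3 handle chunk context?"]],
    model="voyage-context-3",
    input_type="query",     # prepends the query-side retrieval prompt
    output_dimension=512,   # 2048 / 1024 (default) / 512 / 256
    output_dtype="int8",    # float, int8/uint8, binary/ubinary
)

query_vec = query_resp.results[0].embeddings[0]
print("dims:", len(query_vec), "tokens used:", query_resp.total_tokens)
```

Lower output dimensions and quantized dtypes trade a little retrieval precision for smaller vector-store footprints and faster similarity search, which is the tradeoff the summary flags.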
        