🤖 AI Summary
A new Rust port of sentence-transformers, built on the Candle framework, provides a native, production-friendly way to generate sentence embeddings. It ships with first-class support for popular models such as sentence-transformers/all-MiniLM-L6-v2, all-MiniLM-L12-v2, LaBSE, the paraphrase-multilingual and mpnet variants, BAAI/bge-small-en-v1.5, and intfloat's multilingual E5 family, plus additional architectures. Because it plugs into candle_core for device management, you can run embeddings on CUDA GPUs or CPU, and the API (SentenceTransformerBuilder) supports large token batch sizes, embedding normalization, and cosine-similarity utilities out of the box.
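The summary names the `SentenceTransformerBuilder` entry point but not its exact methods, so the sketch below is hypothetical: the crate/import path, the builder methods (`with_device`, `with_batch_size`, `normalize_embeddings`), and the `encode` return type are all assumptions for illustration; only `candle_core::Device::cuda_if_available` is confirmed Candle API. Check the crate's docs for the real signatures.

```rust
use candle_core::Device;
// Hypothetical crate/import path; adjust to the actual crate name.
use sentence_transformers::SentenceTransformerBuilder;

fn main() -> anyhow::Result<()> {
    // Real candle_core API: pick CUDA device 0 if present, else fall back to CPU.
    let device = Device::cuda_if_available(0)?;

    // Builder methods below are assumed from the features the summary describes
    // (model id, device, batch size, normalization); they are not verified names.
    let model = SentenceTransformerBuilder::new("sentence-transformers/all-MiniLM-L6-v2")
        .with_device(device)
        .with_batch_size(32)
        .normalize_embeddings(true)
        .build()?;

    let sentences = ["Rust is memory safe.", "Candle runs models natively."];
    // Assumed encode API returning one Vec<f32> embedding per input sentence.
    let embeddings: Vec<Vec<f32>> = model.encode(&sentences)?;

    // With L2-normalized embeddings, cosine similarity reduces to a dot product.
    let sim: f32 = embeddings[0]
        .iter()
        .zip(embeddings[1].iter())
        .map(|(a, b)| a * b)
        .sum();
    println!("cosine similarity: {sim:.4}");
    Ok(())
}
```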
Technically, the crate can also load any model built on BertModel, XLMRobertaModel, or DistilBertModel with minimal extra configuration, and more bespoke architectures (MPNetMaskedLM, NomicBertModel, Gemma3TextModel) are explicitly supported. The builder lets you choose between safetensors and PyTorch checkpoints, specify pooling and dense-layer folders (useful for multi-component Hugging Face hub exports), and tune batch and device parameters. Significance: this enables low-latency, memory-safe embedding workflows in Rust applications, ideal for production services, embedding databases, and edge/CPU deployment, while preserving interoperability with Hugging Face model configs and modern model families.
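For the checkpoint-format and folder options just described, the configuration might look roughly like this. The method names (`with_safetensors`, `with_pooling_dir`, `with_dense_dir`) are illustrative assumptions, not confirmed API; the folder names shown, however, follow the standard sentence-transformers hub export layout, where pooling and dense modules live in numbered subfolders.

```rust
// All builder methods here are illustrative assumptions based on the features
// the summary describes (safetensors vs PyTorch checkpoints, pooling and
// dense-layer folders); consult the crate docs for the actual names.
let model = SentenceTransformerBuilder::new("sentence-transformers/LaBSE")
    .with_safetensors(true)          // prefer safetensors over PyTorch .bin files
    .with_pooling_dir("1_Pooling")   // pooling config folder in the hub export
    .with_dense_dir("2_Dense")       // optional dense projection layer folder
    .build()?;
```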