Asterisk: A small text embedding model for low-resource hardware (github.com)

🤖 AI Summary
A newly released model called Asterisk* offers compact 256-dimensional text embeddings optimized for low-resource hardware. Developed by Andrew Semenov, it is based on a Transformer encoder and occupies only ~30MB on disk, making it suitable for deployment on edge devices. With INT8 quantization it runs fast inference on CPU, enabling real-time applications such as semantic search, clustering, and deduplication of personal notes.

The significance of Asterisk* lies in a design philosophy that shifts focus from large, resource-intensive models toward accessible, lightweight alternatives. With roughly 2.6 million parameters, it captures semantic similarity well enough for these tasks, having been trained on 1.3 million article-summary pairs from the NEWSROOM dataset. Inference currently requires more memory than the small disk footprint might suggest, and ongoing improvements aim to reduce this. Overall, Asterisk* is a step toward letting more developers deploy AI features in constrained environments without sacrificing usability or functionality.
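For context, the deduplication use case mentioned above typically reduces to a cosine-similarity threshold over embedding vectors. The sketch below illustrates that pattern in plain NumPy; the `embed` function here is a hypothetical stand-in, since the actual loading and encoding API is defined by the Asterisk repository itself.

```python
import numpy as np

EMBED_DIM = 256  # Asterisk produces 256-dimensional embeddings


def embed(texts: list[str]) -> np.ndarray:
    """Hypothetical stand-in for Asterisk's encoder.

    A real call would load the ~30MB model and run the INT8-quantized
    Transformer encoder; here we return random unit vectors so the
    dedup logic below is runnable on its own.
    """
    rng = np.random.default_rng(seed=0)
    vecs = rng.normal(size=(len(texts), EMBED_DIM)).astype(np.float32)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)


def dedupe(notes: list[str], threshold: float = 0.92) -> list[str]:
    """Keep each note unless it is a near-duplicate of an earlier one."""
    vecs = embed(notes)
    kept: list[int] = []
    for i, v in enumerate(vecs):
        # On unit vectors, cosine similarity is just a dot product.
        if all(float(v @ vecs[j]) < threshold for j in kept):
            kept.append(i)
    return [notes[i] for i in kept]


if __name__ == "__main__":
    notes = ["buy milk", "buy milk tomorrow", "file taxes"]
    print(dedupe(notes))
```

The same dot-product comparison underlies the semantic-search and clustering use cases; the threshold value is an illustrative choice, not one taken from the project.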