🤖 AI Summary
A new neural compression API named compress.zip has been unveiled: a tiny GPT model that outperforms traditional compressors such as Brotli and ZSTD by a factor of 600. Users upload a text corpus, and the service trains a custom tokenizer and compression model on it, reaching compression and decompression speeds of 30-50 MB/s on GPU. The project also ships CPU reference implementations in Rust and Python that produce identical output, enabling validation and independent testing.
The significance of compress.zip lies in its deterministic design, particularly for resource-constrained environments. Because all computation uses integer math, it avoids floating-point discrepancies between platforms, producing identical results on hardware from x86 to ARM and even microcontrollers. This portability broadens where the model can run and makes it easier to integrate into existing systems. Fixed-point arithmetic combined with precomputed lookup tables for complex functions guarantees bit-exact output, pointing toward more reliable data management in a range of AI and machine-learning applications.
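The article does not show compress.zip's internals, but the fixed-point-plus-lookup-table idea it describes can be sketched as follows. This is an illustrative example only, assuming a Q16.16 fixed-point format; all names here are hypothetical and not from the compress.zip codebase. The table is precomputed once (floating point is allowed at build time), while the runtime path uses only integer operations, so the same inputs yield the same bits on any platform:

```python
import math

FRAC_BITS = 16
ONE = 1 << FRAC_BITS          # 1.0 in Q16.16
TABLE_SIZE = 256              # resolution of the fractional-part table

# Precomputed offline: exp(f) for f in [0, 1), stored as Q16.16 integers.
EXP_TABLE = [int(math.exp(i / TABLE_SIZE) * ONE) for i in range(TABLE_SIZE)]
E_FIXED = int(math.e * ONE)   # e as a Q16.16 constant

def fixed_exp(x: int) -> int:
    """exp() for non-negative Q16.16 inputs using only integer ops at runtime.

    Splits x into integer and fractional parts: exp(n + f) = e^n * exp(f).
    exp(f) comes from the lookup table; e^n is applied by repeated
    fixed-point multiplication, so the result is bit-exact everywhere.
    """
    int_part = x >> FRAC_BITS
    frac = x & (ONE - 1)
    idx = frac >> (FRAC_BITS - 8)   # top 8 fractional bits index the table
    result = EXP_TABLE[idx]
    for _ in range(int_part):
        result = (result * E_FIXED) >> FRAC_BITS
    return result
```

A model evaluated this way can feed an integer arithmetic coder directly: because every intermediate value is an integer, encoder and decoder recompute identical probabilities, which is the property that makes cross-platform decompression possible.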