🤖 AI Summary
In August 2024, San Francisco startup Magic announced a language model with a 100-million-token context window, far larger than that of competitors such as Gemini. Despite raising over $500 million, the startup has since gone quiet, leaving behind only demos and a benchmark called HashHop, which it open-sourced on GitHub. HashHop fills the context with pairs of random, incompressible hashes and asks the model to chain through them; because the hashes carry no semantic or positional cues, models cannot fall back on pattern-matching shortcuts, and their true retrieval ability is exposed. Magic's reported 95% accuracy for its LTM-2-mini model at 100 million tokens has spurred the AI community's interest in this retrieval mechanism.
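Magic's actual HashHop code is on GitHub; a minimal sketch of the core idea, shuffled random hash pairs plus a multi-hop query (function and parameter names here are illustrative, not Magic's implementation), might look like:

```python
import random
import secrets

def make_hashhop_prompt(chain_len=1000, hops=2, hex_chars=8):
    """Build a HashHop-style task: a shuffled list of `hash = hash`
    assignments forming one long chain, plus a query that requires
    following `hops` links. The hashes are random hex strings, so a
    model cannot shortcut via semantic or positional patterns; it
    must actually retrieve each link."""
    chain = [secrets.token_hex(hex_chars // 2) for _ in range(chain_len + 1)]
    pairs = list(zip(chain[:-1], chain[1:]))
    random.shuffle(pairs)  # destroy any ordering cue in the context
    context = "\n".join(f"{a} = {b}" for a, b in pairs)
    start = random.randrange(chain_len - hops + 1)
    query, answer = chain[start], chain[start + hops]
    return context, query, answer
```

A correct model given `context` and `query` must emit `answer`, which it can reach only by following `hops` assignments through the shuffled list.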
Researchers inspired by Magic's HashHop have developed the Memory-Augmented Language Model (MALM), which treats each retrieval key as a single vocabulary token: rather than generating a key piece by piece, the model retrieves a target such as a function in one prediction step. Trained on diverse code datasets, MALM performs tasks like function retrieval with remarkable accuracy and shows promise for real-world applications, though production deployment and the messiness of real-world code remain open challenges. The work underscores how tokenization strategy alone can shape retrieval performance, pointing to further advances in retrieval-based language models.
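The single-token-key idea can be sketched with a toy vocabulary; the class and method names below (`KeyVocab`, `add_key`) are illustrative assumptions, not MALM's actual API:

```python
class KeyVocab:
    """Toy illustration of the single-token-key trick: each retrieval
    key (e.g. a function name) is mapped to exactly one vocabulary id,
    so the model can retrieve a target with a single softmax prediction
    instead of generating the key character by character."""

    def __init__(self, base_tokens):
        self.token_to_id = {t: i for i, t in enumerate(base_tokens)}
        self.id_to_token = list(base_tokens)

    def add_key(self, key):
        # One atomic token per key, regardless of the key's length.
        if key not in self.token_to_id:
            self.token_to_id[key] = len(self.id_to_token)
            self.id_to_token.append(key)
        return self.token_to_id[key]

vocab = KeyVocab(["def", "(", ")", ":", "return"])
key_id = vocab.add_key("parse_config")  # whole identifier -> one id
```

With keys collapsed to single ids, "retrieve the right function" becomes a one-step classification over the key vocabulary, which is what makes the mapping simpler to learn.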