Apple CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning (github.com)

🤖 AI Summary
Apple has released CLaRa, an open-source Retrieval-Augmented Generation (RAG) framework that augments large language models with external knowledge via continuous latent reasoning. It targets two weaknesses of existing RAG pipelines: the inefficiency of long retrieved contexts, and the separate training of retrievers and compressors, which often degrades semantic preservation. CLaRa compresses retrieved documents by 32x-64x while retaining the information needed for high-quality answer generation. Training proceeds in three stages: compression pretraining with semantic supervision, instruction tuning, and end-to-end fine-tuning that jointly optimizes retrieval and generation. This unified design enables more efficient learning, and CLaRa is reported to outperform prior approaches such as PISCO and LLMLingua-2 across several benchmarks, including multi-hop question-answering datasets. With the model weights available on Hugging Face, CLaRa offers the AI/ML community a practical path to more efficient knowledge retrieval and generation.
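To make the 32x-64x figure concrete, here is a minimal sketch of what document compression at a fixed rate means dimensionally. This is not CLaRa's actual compressor (which is a learned module trained with semantic supervision); the mean-pooling below is a stand-in chosen only to illustrate how a sequence of token embeddings shrinks by the stated factor.

```python
import numpy as np

def compress_embeddings(token_embs: np.ndarray, rate: int = 32) -> np.ndarray:
    """Toy compressor: mean-pool each group of `rate` consecutive token
    embeddings into a single latent vector (zero-padding the tail).
    A real system would use a learned compression module instead."""
    n, d = token_embs.shape
    pad = (-n) % rate                      # tokens needed to fill the last group
    if pad:
        token_embs = np.vstack([token_embs, np.zeros((pad, d))])
    # (n/rate, rate, d) -> mean over each group -> (n/rate, d)
    return token_embs.reshape(-1, rate, d).mean(axis=1)

# A 256-token document with 64-dim embeddings becomes 8 latents at 32x.
doc = np.random.default_rng(0).normal(size=(256, 64))
latents = compress_embeddings(doc, rate=32)
print(latents.shape)  # (8, 64)
```

The generator then attends over these few latent vectors instead of the full retrieved context, which is where the efficiency gain at high compression rates comes from.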