T5Gemma 2: The next generation of encoder-decoder models (blog.google)

🤖 AI Summary
T5Gemma 2 has been announced as the latest evolution in the encoder-decoder model family, building on the innovations from Gemma 3. It is the first multimodal, long-context model in the line, handling both text and images. By adopting tied word embeddings and a merged self- and cross-attention mechanism, T5Gemma 2 reduces parameter count and improves computational efficiency, and it ships in compact sizes such as 270M-270M and 1B-1B that are well suited to on-device applications.

The model's significance lies in its multimodal processing and extended context window of up to 128K tokens, along with improved multilingual support covering over 140 languages. Benchmarks indicate that T5Gemma 2 outperforms comparable Gemma 3 models on various tasks, including visual question answering and long-context reasoning. With pre-trained checkpoints now available for developers, T5Gemma 2 is positioned to enable rapid experimentation and innovation within the AI/ML community, driving forward research and practical applications across multiple domains.
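The two parameter-saving ideas mentioned above can be illustrated with a toy NumPy sketch (a minimal illustration under stated assumptions, not the actual T5Gemma 2 implementation; all names here are made up for the example). Tied word embeddings reuse one matrix as both the input embedding table and the output logit projection, and a merged attention block lets decoder queries attend over decoder and encoder states concatenated into a single key/value sequence, replacing separate self- and cross-attention layers:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d = 1000, 64  # toy vocabulary size and hidden width

# Tied word embeddings: the same matrix E serves as the input
# embedding table and the output (logit) projection, roughly
# halving the vocabulary-related parameter count.
E = rng.normal(size=(vocab, d))

def embed(token_ids):
    return E[token_ids]           # lookup: (T, d)

def logits(hidden):
    return hidden @ E.T           # projection: (T, vocab)

# Merged self-/cross-attention: decoder queries attend over the
# concatenation of decoder states and encoder outputs in one
# attention call. (Causal masking and learned Q/K/V projections
# are omitted for brevity.)
def merged_attention(dec_h, enc_h):
    kv = np.concatenate([dec_h, enc_h], axis=0)    # (T_dec + T_enc, d)
    scores = dec_h @ kv.T / np.sqrt(d)             # (T_dec, T_dec + T_enc)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ kv                            # (T_dec, d)

dec = embed(np.array([1, 2, 3]))   # 3 decoder tokens
enc = rng.normal(size=(5, d))      # 5 encoder output states
out = merged_attention(dec, enc)
print(out.shape)    # (3, 64)
print(logits(out).shape)  # (3, 1000)
```

The sketch only shows the shapes involved; in the real model each attention layer would also apply learned projections, multiple heads, and masking.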