🤖 AI Summary
At India's AI Impact Summit, Sarvam AI launched two ambitious foundation models, with 30B and 105B parameters, designed specifically for 22 Indian languages and reportedly trained on domestic infrastructure. Despite claims of outperforming established models like Gemini Flash and GPT-120B on local benchmarks, detailed technical documentation, benchmark scores, and model weights have yet to be released, leaving the AI community skeptical about the models' validity and readiness for deployment. The 30B model employs a mixture-of-experts architecture that activates only about 1 billion parameters per token, a design intended to keep inference costs low in a price-sensitive market.
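The cost argument behind that mixture-of-experts claim is simple arithmetic: a model can hold many expert sub-networks but route each token through only a few of them, so the parameters actually computed per token stay a small fraction of the total. The sketch below illustrates this with entirely hypothetical numbers (Sarvam has not published its expert count or sizes); the point is only how a ~30B-parameter total can yield roughly ~1B active parameters per token.

```python
def moe_param_counts(n_experts, expert_params, shared_params, top_k):
    """Total vs. per-token-active parameter counts for a sparse MoE model.

    All configuration numbers here are hypothetical, not Sarvam's
    published architecture: the model stores every expert, but each
    token only runs through the shared layers plus its top-k experts.
    """
    total = shared_params + n_experts * expert_params
    active = shared_params + top_k * expert_params
    return total, active

# Hypothetical config: 64 experts of 450M params each, 200M shared
# params, 2 experts routed per token.
total, active = moe_param_counts(
    n_experts=64,
    expert_params=450_000_000,
    shared_params=200_000_000,
    top_k=2,
)
# total is ~29B stored parameters; active is ~1.1B computed per token,
# so inference cost scales with the small active count, not the total.
```

Since per-token compute (and thus serving cost) scales with the active count rather than the stored total, this is how a sparse model can price like a ~1B dense model while retaining far more capacity.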
The significance of Sarvam's announcement lies not just in the potential capabilities of the models but in India's broader strategy to build sovereign AI technology and reduce dependency on foreign infrastructure. The Indian government has invested heavily in the IndiaAI Mission, supplying local companies with resources such as GPUs to support domestic AI development. Sarvam's existing speech-to-text and text-to-speech products show promise and competitive pricing suited to the Indian market, but its larger claims hinge on verification that has yet to arrive. Until then, the models remain in question, highlighting the fine line between innovation and marketing hype in a rapidly evolving AI landscape.