Sarvam Launches 24B Parameter AI Model to Boost Indian Language and AI Development

Sarvam, the startup selected to build India's foundational language model under the IndiaAI Mission, has launched Sarvam-M, a hybrid language model with 24 billion parameters.

Built on top of the Mistral Small model, Sarvam-M is designed to push the boundaries of Indian language understanding, mathematics, and programming tasks, with the aim of setting new benchmarks for AI capabilities in Indian languages. About one-third of its training data is in Indian languages such as Hindi, Tamil, and Telugu, which together cover over 70% of India’s population. Sarvam aims to build sovereign AI for India, complementing efforts like BharatGen’s Param 1, and is actively working on larger multimodal models. The company also wants to give developers tools for fine-tuning, prompt engineering, and real-time deployment as it positions itself as a competitive force in the growing AI industry.

Sarvam-M: Powering India’s AI Sovereignty with a 24B-Parameter Multilingual Model Built for Scale and Purpose

The model is optimised for a range of use cases, including conversational AI, machine translation, and educational tools. Sarvam-M can now be accessed through the Sarvam API, making it available for developers and companies to integrate into their applications. The team behind Sarvam-M has shared insights into the fine-tuning and reinforcement learning techniques they used to enhance the model, with a particular focus on improving its performance in coding and mathematical reasoning.

A key aspect of the model’s development is its emphasis on Indian languages. About 30% of the training data focused on coding, math, and reasoning prompts, while about 50% included translations into various Indian languages. Hindi made up 28% of the Indic data, with other languages such as Bengali, Gujarati, Kannada, and Tamil each contributing around 8%. Together, these are the primary languages of more than a billion people in India.
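As a rough check on the language breakdown, the short Python sketch below adds up the shares quoted above. It assumes the "around 8%" figures are exactly 8% and that the five named languages are the only ones broken out; the variable names are illustrative and do not come from Sarvam.

```python
# Shares of the Indic portion of the training data, as quoted in the article.
# Assumption: the "around 8%" figures are treated as exactly 8%.
indic_share_pct = {
    "Hindi": 28,
    "Bengali": 8,
    "Gujarati": 8,
    "Kannada": 8,
    "Tamil": 8,
}

# Total share covered by the languages the article names explicitly.
named_total_pct = sum(indic_share_pct.values())

# Whatever is left belongs to Indic languages the article does not break out.
unnamed_pct = 100 - named_total_pct

print(f"Named languages: ~{named_total_pct}% of the Indic data")
print(f"Other Indic languages: ~{unnamed_pct}% of the Indic data")
```

Under those assumptions, the five named languages account for roughly 60% of the Indic data, leaving about 40% for languages the article does not list individually.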

Sarvam’s goal with this release is to contribute to the creation of a sovereign AI ecosystem in India. Co-founder Vivek Raghavan expressed his enthusiasm for the launch, describing Sarvam-M as a key step in this journey. This development is part of a broader initiative by Sarvam AI to enhance India’s AI capabilities. Earlier this month, Sarvam also launched Bulbul, a speech AI model supporting 11 Indian languages, offering human-like, region-specific voice interactions.

Work under the IndiaAI Mission does not stop here: Sarvam is also developing a 70-billion-parameter multimodal AI model that will support both Indian languages and English. In parallel, other initiatives such as BharatGen are contributing to India’s AI landscape, with BharatGen’s Param 1 model launched recently. This rapid development of AI models in India reflects a growing effort to foster technological independence and make AI more accessible to the country’s diverse population.

Havilah Mbah

Havilah is a staff writer at The Algorithm Daily, where she covers the latest developments in AI news, trends, and analysis. Outside of writing, Havilah enjoys cooking and experimenting with new recipes.