Sarvam AI outperforms Google Gemini and ChatGPT on India-specific AI tasks: What is Sarvam AI and why it matters | The Times of India



In the bustling tech hubs of Bengaluru, a quiet revolution is under way. Sarvam AI, an AI startup, has reportedly taken the global AI world by storm. With its latest models, Sarvam Vision and Bulbul V3, the company claims to have outperformed some of the biggest names in artificial intelligence, including Google Gemini and OpenAI’s ChatGPT. Sarvam AI’s Pratyush Kumar announced the release on X, describing a state-space-based, 3-billion-parameter vision language model that the company says delivers the best document-digitisation results in English and Indian languages.

What is India’s Sarvam AI that beats other AI models

This new model extends the company’s work in text and voice to vision. Its main focus is document intelligence: extracting information from physical documents, archives and manuscripts, with an emphasis on Indian languages. The model was trained on high-quality datasets covering 22 official Indian languages, including varied financial documents, literature, newspapers, historic texts, and more.

For now, Sarvam AI seems focused on execution rather than hype. With a mix of local insight, cutting-edge AI, and global benchmarking, it might be quietly reshaping how India approaches technology. And for anyone tracking the AI race, it’s worth paying attention.

The Document Intelligence API is free for February 2026, allowing users to explore and build with Sarvam Vision at scale at no cost.

Sarvam AI features explained

Built for accuracy, particularly in understanding Indian languages, and evaluated on global benchmarks, the model offers several features, including:

  • Multimodal vision-language

The model processes images and text together, which makes tasks such as image captioning and chart or table interpretation easier.

  • Document understanding (Indian languages focused)

It has high-accuracy OCR and knowledge extraction for 22 Indian languages, including historic texts and scanned documents.

  • Charts and data interpretation

It is capable of understanding more than text: it also handles charts, data, illustrations, and the visual analysis of documents. It understands and interprets visual elements across multiple languages in the same document.

According to the company, the model excels in global English benchmarks, and it introduces the Sarvam Indic OCR Bench for Indian languages. The Document Intelligence APIs are production-ready and free to use for experimentation in February 2026.

Sarvam Vision OCR accuracy level

Sarvam Vision, the company’s OCR model, reportedly scored 84.3% accuracy on olmOCR-Bench, surpassing Gemini 3 Pro and DeepSeek OCR v2. On OmniDocBench v1.5, it achieved an even higher 93.28%. The model handles diverse content types, scanned documents, and complex layouts as per the official Sarvam blog. The team focused not just on technology, but on making it practical for India’s multilingual landscape.

The company calls itself a “sovereign” AI. The idea is simple: make AI accessible, reliable, and controlled within India. Their website notes an ambition to build foundational AI components tailored to Indian needs. Sarvam AI’s work hasn’t gone unnoticed.

How Sarvam AI differs from other AI models like Gemini and ChatGPT

The most intriguing feature of this model is its focus on Indian languages, whereas other models tend to prioritise English and treat the rest as secondary. It is trained on 22 Indian languages, which gives it high accuracy on regional scripts. While other models can only extract text from documents or images, Sarvam Vision can also interpret visual elements, adding understanding and knowledge beyond the raw text. The company says this yields better performance on a variety of complex documents, which it measures with a large-scale Indic OCR benchmark for Indian languages.
