The focus of AI4Bhārat is on building open-source language AI for Indian languages, including datasets, models, and applications.
Our Mission
Bring parity with respect to English in AI technologies for Indian languages through open-source contributions in datasets, models, and applications, and by enabling an innovation ecosystem.
Our Impact Axes
Data
AI Models
Applications
Ecosystem
Areas
Open-source datasets (Samanantar) and models (IndicTrans) for neural machine translation between English and 12 Indic languages.
Open-source datasets and benchmarks (Aksharantar), models (IndicXlit), and applications for transliteration between the Roman script and the native scripts of 20+ Indic languages.
Open-source models (IndicWav2Vec) for speech recognition in 9 Indian languages.
Open-source language models (IndicBERT), benchmarks (IndicGLUE), and entity recognizers (IndicNER) for 10 Indian languages.
Open-source language generation model (IndicBART) and benchmarks (IndicNLG Suite) for 10 Indian languages.
Open-source datasets (INCLUDE, SignCorpus) and models (OpenHands) for sign language recognition across 10 sign languages from around the world.
Open-source text-to-speech models for 13 Indian languages with support for female and male speakers.
Open-source workbench for AI-assisted language work on Indian languages with initial focus on translation.
Open-source tool for AI-assisted video subtitling and translation, with a focus on educational and media content.
Open-source tool for document-level translation with NMT and transliteration support.
Tools and Models
Tools Site
Models Site
Our Sponsors
As the primary sponsor, Nandan Nilekani has generously contributed to the formation of the "Nilekani center at AI4Bharat" with a focus on open-source language tech as a public good.
Microsoft’s Research Lab and India Development Center (IDC) have supported us with unrestricted research grants and by allocating researchers' time to contribute to open-source technologies.
We are also supported by EkStep Foundation with mentorship and software engineering to build and deploy open-source applications for Indian languages.
We receive valuable support from Nvidia India, granting us access to their cutting-edge compute resources and researchers. This collaboration enables us to actively contribute to the development of open-source models.
Our Team
Researcher at Microsoft
Associate Professor at CSE Department, IIT Madras
Researcher at Microsoft Research and Adjunct Faculty at IIT Madras
Chief AI Evangelist, EkStep and Mentor, AI4Bharat
Positions
1-year program for recent graduates to work on cutting-edge research problems in NLP, Speech, and systems-engineering for AI.
1-semester program for current students to work on data and software engineering for language AI technologies.
Long-term opportunities for experienced language translators and transcribers across Indian languages.
Long-term opportunities for front-end and back-end developers looking to contribute to building open-source applications for language technologies.
“We must be second to none in the application of advanced technologies to the real problems of man and society.”
- Vikram Sarabhai