Speech Recognition

AI technologies for speech recognition promise to enable a variety of use cases, including input tools for users with lower literacy. AI4Bharat focuses on building state-of-the-art speech recognition for many Indian languages and on ensuring that the models can be deployed on mobile devices.

Datasets

Dhwani Dataset

17,000 hours of raw speech data for 40 Indian languages, drawn from a wide variety of domains including education, news, technology, and finance.

Know more

IndicSUPERB

A benchmark of speech recognition tasks including ASR, speaker verification, speaker identification, language identification, query by example, and keyword detection for 12 Indian languages.

Know more

Shrutilipi

Over 6,400 hours of labelled audio across 12 Indian languages mined and aligned from audio broadcasts and PDF transcripts from All India Radio.

Know more

Svarah

A benchmark containing 9.6 hours of transcribed English audio from 117 speakers across 65 geographic locations in India, built to evaluate ASR systems on Indian-accented English.

Know more

Models

On-Device ASR

Compact ASR models that can be quantized and run on Android devices, supporting privacy-preserving inference on personal devices.
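To make "quantized" concrete, here is a minimal sketch of symmetric 8-bit weight quantization in plain Python. It is a generic illustration, not AI4Bharat's actual pipeline: the function names `quantize_int8` and `dequantize` and the sample weights are assumptions made for this example.

```python
def quantize_int8(weights):
    """Map float weights to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only to illustrate the round trip.
weights = [0.5, -1.27, 0.0, 0.31]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight. Production toolchains typically add per-channel scales and calibration on top of this basic idea.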

IndicWav2Vec

State-of-the-art open-source ASR models for 9 languages (including Nepali and Sinhala) as measured on public benchmarks.

Know more
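ASR benchmarks are commonly scored with word error rate (WER); the text above does not name the metric used, so treat this as an assumption. The sketch below computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "a x c" against the reference "a b c d" counts one substitution and one deletion over four reference words. Note that WER can exceed 1.0 when the hypothesis contains many insertions.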

Tools

Chitralekha

Chitralekha is an open-source platform for video subtitling across various Indic languages, backed by ML models (ASR for transcription, NMT for translation, and TTS for voice-over). It supports multiple input sources (e.g., YouTube or local files), several ways of generating transcripts (e.g., models, source captions, or custom subtitle files), and multiple voice-over outputs (e.g., MP3 for audio only, MP4 for combined audio and video).

Know more
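As an illustration of the last step in a subtitling workflow, the sketch below turns timestamped transcript segments into SubRip (SRT) text. This is not Chitralekha's code: the choice of SRT as the subtitle format, the `(start, end, text)` segment shape, and the function names are all assumptions made for this example.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as the SRT form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    Each cue is a 1-based index, a time range, and the caption text;
    cues are separated by a blank line.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    return "\n".join(blocks)
```

In a real pipeline the segments would come from an ASR model's output, optionally passed through translation before being written out.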

Our Partners

DesiCrew

We are working with DesiCrew to collect voice samples from 500 districts across the country, beginning with Tamil Nadu in August 2022.

Karya

Karya is an application developed at Microsoft Research and maintained by Karya Inc. We are using Karya to collect voice samples from across the country.

NPTEL

We are working with NPTEL to deploy the Chitralekha tool for subtitling and translating higher education videos.