AI4Bharat

Speech Synthesis

To learn more about our contributions over the years, see the timeline below!

At AI4Bharat, we are advancing text-to-speech (TTS) technology for Indian languages. We have evaluated TTS models for Dravidian and Indo-Aryan languages and found that FastPitch and HiFi-GAN V1 outperform existing systems. Our open-source models and datasets, including Rasa, the first multilingual expressive TTS dataset for Assamese, Bengali, and Tamil, show significant improvements in expressiveness and offer practical solutions under resource constraints. To address out-of-vocabulary (OOV) issues in low-resource languages like Hindi and Tamil, we propose a cost-effective strategy that uses volunteer-recorded data to improve OOV performance without compromising quality. We also restored the largest multilingual Indian TTS dataset, featuring 1,704 hours of high-quality speech from 10,496 speakers across 22 languages. These efforts are pivotal for advancing TTS technology in India's diverse linguistic landscape.

Timeline

Speech Synthesis