IndicVoices – AI4BHĀRAT

IndicVoices: The Journey

Welcome to the incredible journey of IndicVoices, a monumental endeavor funded by Bhashini, under the Ministry of Electronics and Information Technology, Government of India, and generously supported by Ekstep Foundation and Nilekani Philanthropies. Our ambitious mission? To collect spontaneous speech data across the rich tapestry of Indian languages, while honoring the vast linguistic, cultural, and demographic diversity that India boasts. This quest took us on an exhilarating adventure across the country, from the snowy peaks of the north to the sun-kissed shores of the south, amassing a staggering 7,348 hours of read, extempore, and conversational audio from 16,237 speakers, spanning 145 Indian districts and 22 languages. The journey has just begun, and we remain committed to our goal of capturing ~17,000 hours of voice data across more than 400 districts in India.

The scale of this project was nothing short of epic, involving a dedicated army of 1,893 individuals, including language experts, local mobilizers, coordinators, quality control experts, transcribers, language leads, and project managers. As we embarked on this ambitious journey, we didn't just collect data; we collected stories, laughter, and the myriad voices of India, making IndicVoices not just a project but a life-changing expedition for everyone involved.

Join us as we share some of the most unforgettable moments and experiences from this journey, offering a glimpse into the heart and soul of India through the voices of its people.

Our journey began in Madurai, Tamil Nadu, famous for the Meenakshi Amman Temple. Seeking blessings from the Devi, we set out with high hopes and a clear vision. However, this pilot quickly became a reality check, challenging our assumptions at every turn, especially regarding participant mobilization. Despite the dense population of India, finding willing participants proved unexpectedly difficult. Trust was a major hurdle; many were hesitant to share personal information during registration, fearing potential fraud, especially when digital transactions were mentioned. This skepticism slowed the mobilization process significantly, making it challenging to achieve the desired diversity in age and gender ratios.

Additionally, the time commitment required from participants, sometimes extending up to four hours to complete the recording process, added another layer of complexity. This duration, much longer than anticipated, tested the patience and commitment of our participants. Hesitancy in speaking freely was another obstacle: many participants were reluctant to open up, and their responses often lacked depth and spontaneity, so numerous retakes were needed to capture speech that was natural and usable. Together, these challenges not only extended the duration of our pilot but also highlighted the importance of building trust and communicating clearly to make data collection smoother across the diverse linguistic landscape of India.