To evaluate language models on Indic languages, IndicXTREME provides a robust, human-annotated NLU benchmark consisting of 9 tasks across 18 Indic languages.
Corresponding authors: Sumanth Doddapaneni
If you use any of these resources, please cite the following article:
@misc{doddapaneni2022indicxtreme,
  doi = {10.48550/ARXIV.2212.05409},
  url = {https://arxiv.org/abs/2212.05409},
  author = {Doddapaneni, Sumanth and Aralikatte, Rahul and Ramesh, Gowtham and Goyal, Shreya and Khapra, Mitesh M. and Kunchukuttan, Anoop and Kumar, Pratyush},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {IndicXTREME: A Multi-Task Benchmark For Evaluating Indic Languages},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
IndicXTREME is released under this licensing scheme: