Findings of the Second AmericasNLP Competition on Speech-to-Text Translation

Abteen Ebrahimi, Manuel Mager, Adam Wiemerslage, Pavel Denisov, Arturo Oncevay, Danni Liu, Sai Koneru, Enes Yavuz Ugan, Zhaolin Li, Jan Niehues, Monica Romero, Ivan G. Torre, Tanel Alumäe, Jiaming Kong, Sergey Polezhaev, Yury Belousov, Wei Rui Chen, Peter Sullivan, Ife Adebara, Bashar Talafha, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed, Luis Chiruzzo, Rolando Coto-Solano, Hilaria Cruz, Sofía Flores-Solórzano, Aldo Andrés Alvarez López, Ivan Meza-Ruiz, John E. Ortega, Alexis Palmer, Rodolfo Zevallos, Kristine Stenzel, Thang Vu, Katharina Kann

Research output: Contribution to journal › Conference article › peer-review

Abstract

Indigenous languages, including those from the Americas, have received very little attention from the machine learning (ML) and natural language processing (NLP) communities. To tackle the resulting lack of systems for these languages and the accompanying social inequalities affecting their speakers, we conduct the second AmericasNLP competition (and the first one in collaboration with NeurIPS), which is centered around speech-to-text translation systems for Indigenous languages of the Americas. The competition features three tasks – (1) automatic speech recognition, (2) text-based machine translation, and (3) speech-to-text translation – and two tracks: constrained and unconstrained. Five Indigenous languages are covered: Bribri, Guarani, Kotiria, Wa’ikhana, and Quechua. In this overview paper, we describe the tasks, tracks, and languages, introduce the baseline and participating systems, and end with a summary of ongoing and future challenges for the automatic translation of Indigenous languages.

Original language: English
Pages (from-to): 217-232
Number of pages: 16
Journal: Proceedings of Machine Learning Research
Volume: 220
State: Published - 2023
Event: 36th Annual Conference on Neural Information Processing Systems, NeurIPS 2022 - Virtual, Online, United States
Duration: 28 Nov 2022 to 9 Dec 2022

Keywords

  • Indigenous languages
  • automatic speech recognition
  • low-resource languages
  • low-resource machine translation
  • machine translation
  • natural language processing
  • speech-to-text translation
