Findings of the AmericasNLP 2024 Shared Task on Machine Translation into Indigenous Languages

Abteen Ebrahimi, Ona de Gibert, Raúl Vázquez, Rolando Coto-Solano, Pavel Denisov, Robert Pugh, Manuel Mager, Arturo Oncevay, Luis Chiruzzo, Katharina von der Wense, Shruti Rijhwani

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Citation (Scopus)

Abstract

This paper presents the findings of the third iteration of the AmericasNLP Shared Task on Machine Translation. This year’s competition features eleven Indigenous languages found across North, Central, and South America. Six teams participate, with a total of 157 submissions across all languages and models. Two baselines, the Sheffield and Helsinki systems from 2023, are provided and represent hard-to-beat starting points for the competition. In addition to the baselines, teams are given access to a new repository of training data, which consists of data collected by teams in prior shared tasks. Using ChrF++ as the main competition metric, we see improvements over the baseline for four languages: Chatino, Guarani, Quechua, and Rarámuri, with performance increases over the best baseline of 4.2 ChrF++. In this work, we present a summary of the submitted systems, results, and a human evaluation of system outputs for Bribri, which consists of both (1) a rating of meaning and fluency and (2) a qualitative error analysis of outputs from the best submitted system.

Original language: English
Title of host publication: AmericasNLP 2024 - 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas - Proceedings of the Workshop
Editors: Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, Katharina von der Wense
Publisher: Association for Computational Linguistics (ACL)
Pages: 236-246
Number of pages: 11
ISBN (electronic): 9798891761087
Status: Published - 2024
Event: 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2024 - Mexico City, Mexico
Duration: 21 Jun 2024 → …

Publication series

Name: AmericasNLP 2024 - 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas - Proceedings of the Workshop

Conference

Conference: 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2024
Country/Territory: Mexico
City: Mexico City
Period: 21/06/24 → …
