Corpus creation and initial SMT experiments between Spanish and Shipibo-Konibo

Ana Paula Galarreta, Andrés Melgar, Arturo Oncevay-Marcos

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

25 Citations (Scopus)

Abstract

In this paper, we present the first attempts to develop a machine translation (MT) system between Spanish and Shipibo-Konibo (es-shp). There are very few digital texts written in Shipibo-Konibo and even fewer bilingual texts that can be aligned, so we had to create a parallel corpus using both bilingual and monolingual texts. We describe how this corpus was built, as well as the process we followed to improve the quality of the sentences used to build a statistical MT (SMT) model. The results surpassed the proposed dictionary-based baseline and are promising for further development given the size of the corpus used. Finally, we expect that this MT system can be reinforced with additional linguistic rules and automatic language processing functions that are currently being implemented.

Original language: English
Title of host publication: International Conference on Recent Advances in Natural Language Processing
Subtitle of host publication: Meet Deep Learning, RANLP 2017 - Proceedings
Editors: Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Ivelina Nikolova, Irina Temnikova
Publisher: Incoma Ltd
Pages: 238-244
Number of pages: 7
ISBN (electronic): 9789544520489
Status: Published - 2017
Event: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017 - Varna, Bulgaria
Duration: 2 Sep 2017 to 8 Sep 2017

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
Volume: 2017-September
ISSN (print): 1313-8502

Conference

Conference: 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017
Country/Territory: Bulgaria
City: Varna
Period: 2/09/17 to 8/09/17
