TY - GEN
T1 - Corpus creation and initial SMT experiments between Spanish and Shipibo-Konibo
AU - Galarreta, Ana Paula
AU - Melgar, Andrés
AU - Oncevay-Marcos, Arturo
N1 - Publisher Copyright: © 2018 Association for Computational Linguistics (ACL). All rights reserved.
PY - 2017
Y1 - 2017
N2 - In this paper, we present the first attempts to develop a machine translation (MT) system between Spanish and Shipibo-Konibo (es-shp). There are very few digital texts written in Shipibo-Konibo and even fewer bilingual texts that can be aligned; hence, we had to create a parallel corpus using both bilingual and monolingual texts. We describe how this corpus was built, as well as the process we followed to improve the quality of the sentences used to build a statistical MT (SMT) model. The results surpassed the proposed dictionary-based baseline and are promising for further development, considering the size of the corpus used. Finally, we expect that this MT system can be reinforced with additional linguistic rules and automatic language processing functions that are currently being implemented.
AB - In this paper, we present the first attempts to develop a machine translation (MT) system between Spanish and Shipibo-Konibo (es-shp). There are very few digital texts written in Shipibo-Konibo and even fewer bilingual texts that can be aligned; hence, we had to create a parallel corpus using both bilingual and monolingual texts. We describe how this corpus was built, as well as the process we followed to improve the quality of the sentences used to build a statistical MT (SMT) model. The results surpassed the proposed dictionary-based baseline and are promising for further development, considering the size of the corpus used. Finally, we expect that this MT system can be reinforced with additional linguistic rules and automatic language processing functions that are currently being implemented.
UR - http://www.scopus.com/inward/record.url?scp=85040557019&partnerID=8YFLogxK
U2 - 10.26615/978-954-452-049-6_033
DO - 10.26615/978-954-452-049-6_033
M3 - Conference contribution
AN - SCOPUS:85040557019
T3 - International Conference Recent Advances in Natural Language Processing, RANLP
SP - 238
EP - 244
BT - International Conference on Recent Advances in Natural Language Processing
A2 - Angelova, Galia
A2 - Bontcheva, Kalina
A2 - Mitkov, Ruslan
A2 - Nikolova, Ivelina
A2 - Temnikova, Irina
PB - Incoma Ltd
T2 - 11th International Conference on Recent Advances in Natural Language Processing, RANLP 2017
Y2 - 2 September 2017 through 8 September 2017
ER -