TY - GEN
T1 - Ship-lemmatagger
T2 - 20th International Conference on Text, Speech and Dialogue, TSD 2017
AU - Pereira-Noriega, José
AU - Mercado-Gonzales, Rodolfo
AU - Melgar, Andrés
AU - Sobrevilla-Cabezudo, Marco
AU - Oncevay-Marcos, Arturo
N1 - Publisher Copyright: © Springer International Publishing AG 2017.
PY - 2017
Y1 - 2017
AB - Natural Language Processing deals with the understanding and generation of texts through computer programs. Many different functionalities are used in this area, but some of them underpin the rest. These are the methods related to the core processing of the morphology of the language (such as lemmatization) and to the automatic identification of part-of-speech tags. Accordingly, this paper describes the implementation of a basic NLP toolkit for a new language, focusing on the features mentioned above and testing them on a corpus built for this purpose. The results exceeded expectations and could be used for more complex tasks such as machine translation.
KW - Lemmatization
KW - Low resource language
KW - Part-of-speech tagging
KW - Shipibo-konibo
UR - http://www.scopus.com/inward/record.url?scp=85028645758&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-64206-2_53
DO - 10.1007/978-3-319-64206-2_53
M3 - Conference contribution
AN - SCOPUS:85028645758
SN - 9783319642055
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 473
EP - 481
BT - Text, Speech, and Dialogue - 20th International Conference, TSD 2017, Proceedings
A2 - Ekstein, Kamil
A2 - Matousek, Vaclav
PB - Springer Verlag
Y2 - 27 August 2017 through 31 August 2017
ER -