Spell-checking based on syllabification and character-level graphs for a Peruvian agglutinative language

Carlo Alva, Arturo Oncevay-Marcos

Research output: Chapter in book/report/conference proceeding › Conference contribution › Peer-reviewed

7 Citations (Scopus)

Abstract

Peru has several native languages, most of which are agglutinative. These languages are transmitted from generation to generation mainly in oral form, which has led to different written forms across communities. For this reason, there are recent efforts to standardize spelling in written texts, and an automatic tool such as a spell-checker would help support these tasks. The spelling corrector under development works in two steps: an automatic rule-based syllabification method, and a character-level graph that detects the degree of error in a misspelled word. The experiments were conducted on Shipibo-konibo, a highly agglutinative Amazonian language, and the results obtained on a dataset built for this purpose are promising.
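The abstract names the two steps but does not give the syllabification rules or the graph construction, so the following Python sketch is only an illustration under stated assumptions. It assumes a toy (C)V(C) syllable pattern over single letters (ignoring digraphs), builds a graph of adjacent-character transitions from a small list of known-correct words, and takes the "degree of error" to be the fraction of a word's transitions that are missing from that graph. The names `syllabify`, `build_char_graph`, and `error_degree`, the vowel set, and the example words are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Assumed vowel inventory for the toy example (not the paper's rule set).
VOWELS = set("aeiou")


def syllabify(word):
    """Split a word into syllables using an assumed (C)V(C) pattern.

    Each syllable gets exactly one vowel. A consonant right after the vowel
    is kept as a coda only when it is not followed by another vowel.
    """
    syllables = []
    current = ""
    i = 0
    n = len(word)
    while i < n:
        ch = word[i]
        current += ch
        if ch in VOWELS:
            nxt = word[i + 1] if i + 1 < n else ""
            after = word[i + 2] if i + 2 < n else ""
            # Attach a coda consonant if the next onset does not need it.
            if nxt and nxt not in VOWELS and (after == "" or after not in VOWELS):
                current += nxt
                i += 1
            syllables.append(current)
            current = ""
        i += 1
    if current:
        # Leftover consonants without a vowel join the last syllable.
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables


def build_char_graph(words):
    """Build a directed graph of adjacent characters seen in known-correct words."""
    graph = defaultdict(set)
    for w in words:
        chars = ["^"] + list(w) + ["$"]  # "^" and "$" mark word start and end
        for a, b in zip(chars, chars[1:]):
            graph[a].add(b)
    return graph


def error_degree(word, graph):
    """Fraction of the word's character transitions that are absent from the graph."""
    chars = ["^"] + list(word) + ["$"]
    pairs = list(zip(chars, chars[1:]))
    missing = sum(1 for a, b in pairs if b not in graph.get(a, ()))
    return missing / len(pairs)


if __name__ == "__main__":
    lexicon = ["kanta", "banana"]  # made-up words, for illustration only
    graph = build_char_graph(lexicon)
    for candidate in ["kanta", "kanat"]:
        print(candidate, syllabify(candidate), error_degree(candidate, graph))
```

A score of zero means every transition in the candidate was seen in the lexicon. A higher score flags a likely misspelling. A real system would presumably use the language's actual syllable rules and a much larger word list.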

Original language: English
Title of host publication: EMNLP 2017 - 1st Workshop on Subword and Character Level Models in NLP, SCLeM 2017 - Proceedings of the Workshop
Editors: Manaal Faruqui, Hinrich Schutze, Isabel Trancoso, Yaghoobzadeh Yadollah
Publisher: Association for Computational Linguistics (ACL)
Pages: 109-116
Number of pages: 8
ISBN (electronic): 9781945626913
DOI
Status: Published - 2017
Event: EMNLP 2017 1st Workshop on Subword and Character Level Models in NLP, SCLeM 2017 - Copenhagen, Denmark
Duration: 7 Sep 2017 → …

Publication series

Name: EMNLP 2017 - 1st Workshop on Subword and Character Level Models in NLP, SCLeM 2017 - Proceedings of the Workshop

Conference

Conference: EMNLP 2017 1st Workshop on Subword and Character Level Models in NLP, SCLeM 2017
Country/Territory: Denmark
City: Copenhagen
Period: 7/09/17 → …
