Representation of Yine (Arawak) Morphology by Finite State Transducer Formalism

Adriano M. Ingunza, John E. Miller, Arturo Oncevay, Roberto Zariquiey

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

We represent the complexity of Yine (Arawak) morphology with a finite state transducer (FST) based morphological analyzer. Yine is a low-resource, polysynthetic indigenous language of Peru, spoken by approximately 3,000 people and classified as ‘definitely endangered’ by UNESCO. We review Yine morphology, focusing on morphophonology, possessive constructions, and verbal predicates. We then develop FSTs to model these components, proposing techniques to solve challenging problems such as the complex patterns of incorporating open- and closed-category arguments. This is a work in progress, and the analyzer still requires further development and verification. Our analyzer will serve both as a tool to better document the Yine language and as a component of natural language processing (NLP) applications such as spell checking and correction.

Original language: English
Title of host publication: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
Editors: Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann
Publisher: Association for Computational Linguistics (ACL)
Pages: 102-112
Number of pages: 11
ISBN (Electronic): 9781954085442
State: Published - 2021
Event: 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021 - Virtual, Online
Duration: 11 Jun 2021 → …

Publication series

Name: Proceedings of the 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021

Conference

Conference: 1st Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2021
City: Virtual, Online
Period: 11/06/21 → …
