TY - GEN
T1 - Neural Borrowing Detection with Monolingual Lexical Models
AU - Miller, John E.
AU - Pariasca, Emanuel
AU - Beltrán Castañón, César A.
N1 - Publisher Copyright: © 2021 Incoma Ltd. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Identification of lexical borrowings, transfer of words between languages, is an essential practice of historical linguistics and a vital tool in analysis of language contact and cultural events in general. We seek to improve tools for automatic detection of lexical borrowings, focusing here on detecting borrowed words from monolingual wordlists. Starting with a recurrent neural network lexical model and competing entropies approach, we incorporate a more current Transformer-based lexical model. From there we experiment with several different models and approaches including a lexical donor model with an augmented wordlist. The Transformer model reduces execution time and minimally improves borrowing detection, and the augmented donor model shows some promise. A substantive change in approach or model seems necessary for significant gains in detection of lexical borrowings.
UR - http://www.scopus.com/inward/record.url?scp=85122955538&partnerID=8YFLogxK
U2 - 10.26615/issn.2603-2821.2021_016
DO - 10.26615/issn.2603-2821.2021_016
M3 - Conference contribution
AN - SCOPUS:85122955538
T3 - International Conference Recent Advances in Natural Language Processing, RANLP
SP - 109
EP - 117
BT - Proceedings of the 1st Workshop on Multimodal Machine Translation for Low Resource Languages, MMTLRL 2021 in conjunction with International Conference on Recent Advances in Natural Language Processing, RANLP 2021
A2 - Rapp, Reinhard
A2 - Singh, Thoudam Doren
A2 - Espana i Bonet, Cristina
A2 - Bandyopadhyay, Sivaji
A2 - Sharoff, Serge
A2 - Van Genabith, Josef
A2 - Zweigenbaum, Pierre
PB - Incoma Ltd
T2 - 2021 Student Research Workshop, SRW 2021 associated with the International Conference on Recent Advances in Natural Language Processing, RANLP 2021
Y2 - 1 September 2021 through 3 September 2021
ER -