Linking endangerment databases and descriptive linguistics: An assessment of the use of terms relating to language endangerment in grammars

Roberto Zariquiey, Mónica Arakaki, Javier Vera, Guido Torres-Orihuela, Claret Cuba-Raime, Carlos Barrientos, Aracelli García, Adriano Ingunza, Harald Hammarström

Research output: Contribution to journal, Article, peer-review

Abstract

The world harbours a diversity of some 6,500 mutually unintelligible languages. As has been increasingly observed by linguists, many minority languages are becoming endangered and will be lost forever if not documented. The increased urgency has led to the development of several global endangerment databases and a more fine-grained understanding of the language endangerment progression as well as its possible reversal. In the present paper, we explore the terminological correlates of this development as found in the descriptive linguistic literature, using a corpus of over 10,000 digitized grammatical descriptions. Comparing this with existing endangerment databases, we find that simply counting terms related to endangerment does signal endangerment, but the degree of endangerment is more difficult to assess from grammatical descriptions. The label endangered seems to be an umbrella term that covers different situations ranging from moribund languages with fewer than ten speakers to minority languages with several thousand speakers. For many languages considered endangered in existing databases, explicit terms to this effect cannot be found in their descriptions. The discrepancy is due to incompleteness of the search term set, gaps in the literature, and projected rather than observed information in the databases. Our explorations illustrate the potential for database curation assisted by computational searches both to maintain accuracy of the databases and to investigate assumed language endangerment. Future work includes a larger cloud of search terms, usage of term frequencies, and prescreening of descriptive literature for the existence of a relevant section. From the perspective of descriptive linguistics, this study calls for a more careful correlation between the language endangerment indexes, as developed in the global endangerment databases, and the treatment of the endangerment status of individual languages in descriptive grammars.

Original language: English
Pages (from-to): 290-318
Number of pages: 29
Journal: Language Documentation and Conservation
Volume: 16
State: Published - 2022
