Bootstrapping the Language Archive
New prospects for Natural Language Processing in Preserving Linguistic Heritage
DOI:
https://doi.org/10.33011/lilt.v6i.1243
Keywords:
small languages, corpora, field data
Abstract
There are grounds to believe that language technology in general, and natural language processing in particular, have important roles to play in creating and analyzing corpora for small languages. This goes beyond the development of data management tools to the application of natural language processing techniques to small and noisy datasets, and the design of new methods that operate within the constraints of linguistic field data. A set of seven such constraints (or "axioms for scalable work with small languages") is presented, and suggestions for further NLP research are related back to these axioms.
License
This work is licensed under CC BY 4.0, which permits you to use, share, adapt, distribute, and reproduce it in any medium or format, provided you credit the original author(s) and source.