Bootstrapping the Language Archive

New prospects for Natural Language Processing in Preserving Linguistic Heritage

Authors

  • Steven Bird, University of Melbourne & University of Pennsylvania

DOI:

https://doi.org/10.33011/lilt.v6i.1243

Keywords:

small languages, corpora, field data

Abstract

There are grounds to believe that language technology in general, and natural language processing in particular, have important roles to play in creating and analyzing corpora for small languages. This goes beyond the development of data management tools to the application of natural language processing techniques to small and noisy datasets, and the design of new methods that operate within the constraints of linguistic field data. A set of seven such constraints (or "axioms for scalable work with small languages") is presented, and suggestions for further NLP research are related back to these axioms.

Published

2011-10-01

How to Cite

Bird, S. (2011). Bootstrapping the Language Archive: New prospects for Natural Language Processing in Preserving Linguistic Heritage. Linguistic Issues in Language Technology, 6. https://doi.org/10.33011/lilt.v6i.1243