Volume 25 Fifty Years of CRIL
Working Papers

Computational Morphology for Language Documentation and Description

Sarah Moeller
University of Colorado
Published August 22, 2021
Keywords
  • language documentation
  • natural language processing
  • low-resource languages
  • machine learning
  • NLP

Abstract

During this time of heightened interest in computational methods for low-resource languages, an important question needs to be explored: “What [computational] methods...can detect [morphological] structure in small, noisy data sets, while being directly applicable to a wide variety of languages?” (Bird, 2009). This paper provides an overview of common natural language processing (NLP) methods and how those methods have been applied to the study of morphology, particularly in low-resource languages (LRLs). For NLP, work with LRLs is still uncharted territory. This paper explores the possibilities of training computational morphological models on data produced by language documentation and description (LDD) field projects. It identifies models and techniques that seem likely to succeed if integrated into linguistic fieldwork.