Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus

Authors

  • Nikolett Mus, Hungarian Research Institute for Linguistics
  • Hungarian Research Institute for Linguistics Hungarian Research Institute for Linguistics

DOI:

https://doi.org/10.33011/computel.v2i.975

Abstract

In this paper, we report on the main lessons drawn from the first year of a Tundra Nenets (Samoyedic, Uralic) corpus-building project carried out at the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First, we collect, process and archive written (and, in the later part of the project period, spoken) data of Tundra Nenets. Second, we build a parallel corpus, i.e. a Tundra Nenets–Russian corpus, to support and encourage preferably synchronic syntactic research on Tundra Nenets. After discussing certain language- and culture-specific factors that potentially influence the sampling method, we present the stages of our work in detail.

Published

2021-03-02