Lexeme: the Concept of System and the Creation of Speech Corpora for Two Endangered Languages

Authors

  • Ivan Ubaleht, Omsk State Technical University

DOI:

https://doi.org/10.33011/computel.v2i.981

Abstract

In this paper, we present the concept of the Lexeme system, a new application for managing speech corpora for endangered languages. The Lexeme system is currently under development. We also present the first results of creating speech corpora for two endangered languages, Siberian Ingrian Finnish and Siberian Tatar. The speech data for these languages have been published, are publicly accessible, and are licensed under a Creative Commons Attribution 4.0 license.

Published

2021-03-02