Parallel Syntactic Annotation in CReST

Authors

  • Sandra Kübler, Indiana University
  • Eric Baucom, Indiana University
  • Matthias Scheutz, Tufts University

DOI:

https://doi.org/10.33011/lilt.v7i.1265

Keywords:

treebank, annotation, syntax, dialogue

Abstract

In this paper, we introduce the syntactic annotation of the CReST corpus, a corpus of natural language dialogues obtained from humans performing a cooperative, remote search task. The corpus contains the speech signals as well as transcriptions of the dialogues, which are additionally annotated for dialogue structure, disfluencies, and syntax. The syntactic annotation comprises POS annotation, Penn Treebank-style constituent annotations, dependency annotations, and combinatory categorial grammar annotations. The corpus is the first of its kind, providing parallel syntactic annotation based on three different grammar formalisms for a dialogue corpus. All three annotations are manually corrected, thus providing a high-quality resource for linguistic comparisons as well as for parser evaluation across frameworks.

Published

2012-01-01

How to Cite

Kübler, S., Baucom, E., & Scheutz, M. (2012). Parallel Syntactic Annotation in CReST. Linguistic Issues in Language Technology, 7. https://doi.org/10.33011/lilt.v7i.1265