There often exist multiple corpora for the same natural language processing (NLP) tasks. However, such corpora are generally used independently due to distinctions in annotation s...
Availability of labeled language resources, such as annotated corpora and domain dependent labeled language resources is crucial for experiments in the field of Natural Language ...
In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragme...
In this paper, we present a quantitative comparison between the syntactic structures of three languages: English, Chinese and Korean. This is made possible by first extracting Lex...
Fei Xia, Chung-hye Han, Martha Stone Palmer, Aravi...
This paper presents a corpus study of parenthetical constructions in two different corpora: the Penn Discourse Treebank (PDTB, (PDTBGroup, 2008)) and the RST Discourse Treebank (C...