In this paper we bring to light a novel intersection between corpus linguistics and behavioral data that can be employed as an evaluation metric for resources for low-density lang...
Within the EU-funded COMPANIONS project, we are working to evaluate new collaborative conversational models of dialogue. Such an evaluation requires us to benchmark approaches to ...
Nick Webb, David Benyon, Jay Bradley, Preben Hanse...
The Brandeis Annotation Tool is a web-based text annotation tool that is centered around the notions of layered annotation and task decomposition. It allows annotations to refer t...
Propp's influential structural analysis of fairy tales created a powerful schema for representing storylines in terms of character functions, which is directly exploitable fo...
CzEng 0.9 is the third release of a large parallel corpus of Czech and English. For the current release, CzEng was extended by a significant amount of text from various types of so...
In this paper, we investigate the acoustic properties of phonemes in three speaking styles: read speech, prepared speech and spontaneous speech. Our aim is to better understand wh...
We propose a strategy to reduce the impact of the sparse data problem in the tasks of lexical information acquisition based on the observation of linguistic cues. It justifies tha...
We describe the re-annotation of selected types of named entities (persons, organizations, locations) from the MUC7 corpus. The focus of this annotation initiative is on recording...
The definition of lexical semantic similarity measures has been the subject of much work for many years. In this article, we focus more specifically on distributional semantic...
In this paper, we discuss our analysis and resulting new annotations of Penn Discourse Treebank (PDTB) data tagged as Concession. Concession arises whenever one of the two argumen...