The Arabic Treebank team at the Linguistic Data Consortium has significantly revised and enhanced its annotation guidelines and procedure over the past year. Improvements were mad...
The paper describes a project whose main purpose is the creation of the Slovene terminology web portal, funded by the Slovene Research Agency and the Amebis software company. It...
This paper presents two highly innovative digital editions. For the creation and the implementation of these editions the latest developments within corpus research ha...
This paper, the 5th in a series of biennial progress reports, reviews the activities of the Linguistic Data Consortium with particular emphasis on general trends in the language r...
Arrau is a new corpus annotated for anaphoric relations, with information about agreement and explicit representation of multiple antecedents for ambiguous anaphoric expressions and disco...
This paper explores how a battery of unsupervised techniques can be used in order to create large, high-quality corpora for textual inference applications, such as systems for rec...
We present an experiment in extracting collocations from the FrameNet corpus, specifically, support verbs such as "direct" in Environmentalists directed strong criticism at world le...
This paper introduces Saxon, a rule-based document annotator that is capable of processing and annotating several document formats and media, both within and across documents. Fur...
This paper introduces a knowledge representation formalism used for annotation of the French MEDIA dialogue corpus in terms of high-level semantic structures. The semantic annotat...