We present a semi-supervised machine-learning approach for the classification of adjectives into property- vs. relation-denoting adjectives, a distinction that is highly relevant f...
We present a method for acquiring reliable predicate-argument structures from raw corpora for automatic compilation of case frames. Such lexicon compilation requires highly reliab...
A novel method to automatically associate ontological concepts to their realisations in texts is presented. The method has been developed in the context of the Papyrus project to ...
The goal of DARPA's Machine Reading (MR) program is nothing less than making the world's natural language corpora available for formal processing. Most text processing r...
Stephanie Strassel, Dan Adams, Henry Goldberg, Jon...
The paper presents an innovative approach to extract Slovene definition candidates from domain-specific corpora using morphosyntactic patterns, automatic terminology recognition a...
This paper presents a novel system, HENNA (Hybrid Person Name Analyzer), for identifying language origin and analyzing linguistic structures of person names. We conduct ME-based cla...
This paper presents a system for querying treebanks in a uniform way. The system is able to work with both dependency and constituency based treebanks in any language. We demonstr...
This paper presents the development of an open-source Spanish Dependency Grammar implemented in the FreeLing environment. This grammar was designed as a resource for NLP applications ...
The Quranic Arabic Dependency Treebank (QADT) is part of the Quranic Arabic Corpus (http://corpus.quran.com), an online linguistic resource organized by the University of Leeds, a...
CASIA-CASSIL is a large-scale corpus base of Chinese human-human naturally-occurring telephone conversations in restricted domains. The first edition consists of 792 90-second con...