In recent years, tree kernels have been proposed for the automatic learning of natural language applications. Unfortunately, they show (a) an inherent super-linear complexity and (...
The degree of dominance of a sense of a word is the proportion of occurrences of that sense in text. We propose four new methods to accurately determine word sense dominance using...
In this paper we extend the maximum spanning tree (MST) dependency parsing framework of McDonald et al. (2005c) to incorporate higher-order feature representations and allow depen...
We present a model for sentence compression that uses a discriminative large-margin learning framework coupled with a novel feature set defined on compressed bigrams as well as dee...
I propose a uniform approach to the elimination of redundancy in CCG lexicons, where grammars incorporate inheritance hierarchies of lexical types, defined over a simple, feature-...
Term translation probabilities proved an effective method of semantic smoothing in the language modelling approach to information retrieval. We use Generalized Latent Semantic Ana...
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpu...
Most state-of-the-art evaluation measures for machine translation assign high costs to movements of word blocks. In many cases, though, such movements still result in correct or alm...
Current Named Entity Recognition systems suffer from the lack of hand-tagged data as well as degradation when moving to other domains. This paper explores two aspects: the automati...
In this paper, we present a formalization of grammatical role labeling within the framework of Integer Linear Programming (ILP). We focus on the integration of subcategorization i...