We investigate the connection between part of speech (POS) distribution and content in language. We define POS blocks to be groups of parts of speech. We hypothesise that there ex...
In this paper we propose a small set of lexical conceptual relations that allow adjectives to be encoded in computational relational lexica in a principled and integrated way. Our ma...
In this paper we investigate how to automatically determine if two document collections are written from different perspectives. By perspective we mean a point of view, for examp...
We present a perceptron-style discriminative approach to machine translation in which large feature sets can be exploited. Unlike discriminative reranking approaches, our system c...
A query speller is crucial to a search engine in improving web search relevance. This paper describes novel methods for the use of distributional similarity estimated from query logs in...
This paper presents a discriminative method for pruning n-gram language models for Chinese word segmentation. To reduce the size of the language model that is used in a Chinese word...
Event-based summarization attempts to select and organize the sentences in a summary with respect to the events or the sub-events that the sentences describe. Each event has its o...
This paper shows that a simple two-stage approach to handling non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local depen...
Due to historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as ...
away concepts from the surface form of the text. The authors argue that while there has been research into automatic classification, general classification schemes are unsuitable f...