Machine learning algorithms in various forms are increasingly being used on a variety of portable devices, ranging from cell phones to PDAs. They often form a part of standard...
We investigate the novel problem of event recognition from news webpages. "Events" are basic text units containing news elements. We observe that a news article is always...
This paper describes a novel approach to improving multi-document summarization based on cross-document information extraction (IE). We describe a method to automatically incorpora...
As the type of content available on the web is becoming increasingly diverse, a particular challenge is to properly determine the types of documents sought by a user, tha...
Shanu Sushmita, Benjamin Piwowarski, Mounia Lalmas
Many tasks of information extraction or natural language processing have a property that the data naturally consist of several views--disjoint subsets of features. Specifically, a ...
Retrieval in historic documents with non-standard spelling requires a mapping from search terms onto the historic terms in the document. For describing this mapping, we have develo...
The dominant method for evaluating search engines is the Cranfield paradigm, but the existing metrics do not consider some modern search engine features, such as document snippets...
This paper is concerned with relevance ranking in search, particularly that using term dependency information. It proposes a novel and unified approach to relevance ranking using ...
The Slot Filling task requires a system to automatically distill information from a large document collection and return answers for a query entity with specified attributes (`slot...
Zheng Chen, Suzanne Tamang, Adam Lee, Xiang Li, Ma...