Abstract. Too much information kills information. This common statement applies to huge databases, where state of the art search engines may retrieve hundreds of very similar docum...
Yann Landrin-Schweitzer, Pierre Collet, Evelyne Lu...
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
Text segmentation is important for text analysis, while text alignment is to determine shared sub-topics among similar documents. Multi-task text segmentation and alignment is the...
A Question Answering (QA) system aims to return exact answers to natural language questions. While today information retrieval techniques are quite successful at locating within l...
We explore the relationship between time and relevance using TREC ad-hoc queries. A type of query is identified that favors very recent documents. We propose a time-based language...