Temporal expressions, such as "between 1992 and 2000", are frequent across many kinds of documents. Text retrieval, though, treats them as common terms, thus ignoring their inherent semantics. For queries with a strong temporal component, such as "U.S. president 1997", this leads to a decrease in retrieval effectiveness, since relevant documents (e.g., a biography of Bill Clinton containing the aforementioned temporal expression) cannot be reliably matched to the query. We propose a novel approach, based on language models, to make temporal expressions first-class citizens of the retrieval model. In addition, we present experiments that show actual improvements in retrieval effectiveness.

Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Retrieval models

General Terms
Algorithms, Experimentation, Performance

Keywords
Temporal Information Retrieval, Language modeling

Irem Arikan, Srikanta J. Bedathur, Klaus Berberich