This paper addresses the issue of devising a new document prior for the language modeling (LM) approach for Information Retrieval. The prior is based on term statistics, derived in...
Terminologies and other knowledge resources are widely used to aid entity recognition in specialist domain texts. As well as providing lexicons of specialist terms, linkage from t...
Angus Roberts, Robert Gaizasukas, Mark Hepple, Yik...
Our KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an autonomous, domain...
Oren Etzioni, Michael J. Cafarella, Doug Downey, A...
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents comprise of a significant amount of erroneous words and unfortunately most informat...
We present a type and effect system for flow analysis that makes essential use of higher-ranked polymorphism. We show that, for higher-order functions, the expressiveness of highe...