Sciweavers

495 search results - page 85 / 99
» Improving the Boosted Correlogram
ECIR
2008
Springer
13 years 9 months ago
Probabilistic Document Length Priors for Language Models
This paper addresses the issue of devising a new document prior for the language modeling (LM) approach for Information Retrieval. The prior is based on term statistics, derived in...
Roi Blanco, Alvaro Barreiro
LREC
2008
13 years 9 months ago
Combining Terminology Resources and Statistical Methods for Entity Recognition: an Evaluation
Terminologies and other knowledge resources are widely used to aid entity recognition in specialist domain texts. As well as providing lexicons of specialist terms, linkage from t...
Angus Roberts, Robert Gaizauskas, Mark Hepple, Yik...
AAAI
2004
13 years 9 months ago
Methods for Domain-Independent Information Extraction from the Web: An Experimental Comparison
Our KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an autonomous, domain...
Oren Etzioni, Michael J. Cafarella, Doug Downey, A...
DRR
2003
13 years 9 months ago
Information retrieval for OCR documents: a content-based probabilistic correction model
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents comprise a significant number of erroneous words and unfortunately most informat...
Rong Jin, ChengXiang Zhai, Alexander G. Hauptmann
ICFP
2010
ACM
13 years 8 months ago
Polyvariant flow analysis with higher-ranked polymorphic types and higher-order effect operators
We present a type and effect system for flow analysis that makes essential use of higher-ranked polymorphism. We show that, for higher-order functions, the expressiveness of highe...
Stefan Holdermans, Jurriaan Hage