Sciweavers

3430 search results - page 78 / 686
» Language model for IR using collection information
DRR
2010
15 years 6 months ago
Efficient automatic OCR word validation using word partial format derivation and language model
In this paper we present an OCR validation module, implemented for the System for Preservation of Electronic Resources (SPER) developed at the U.S. National Library of Medicine. ...
Siyuan Chen, Dharitri Misra, George R. Thoma
KDD
2007
ACM
Data Mining
16 years 4 months ago
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases
We now have incrementally-grown databases of text documents ranging back for over a decade in areas ranging from personal email, to news-articles and conference proceedings. While...
Benyah Shaparenko, Thorsten Joachims
ICFCA
2007
Springer
15 years 10 months ago
Computing Intensions of Digital Library Collections
We model a Digital Library as a formal context in which objects are documents and attributes are terms describing document contents. A formal concept is very close to the notion o...
Carlo Meghini, Nicolas Spyratos
CHI
2008
ACM
15 years 6 months ago
Word usage and posting behaviors: modeling blogs with unobtrusive data collection methods
We present a large-scale analysis of the content of weblogs dating back to the release of the Blogger program in 1999. Over one million blogs were analyzed from their conception t...
Adam D. I. Kramer, Kerry Rodden
SIGIR
2008
ACM
15 years 4 months ago
Measuring concept relatedness using language models
Over the years, the notion of concept relatedness has attracted considerable attention. A variety of approaches, based on ontology structure, information content, association, or ...
Dolf Trieschnigg, Edgar Meij, Maarten de Rijke, We...