Sciweavers

542 search results - page 89 / 109
» Learning author-topic models from text corpora
CVPR
2009
IEEE
15 years 2 months ago
Contextual Restoration of Severely Degraded Document Images
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models...
Jyotirmoy Banerjee, Anoop M. Namboodiri, C. V. Jaw...
CIKM
2005
Springer
14 years 1 month ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...
SDM
2010
SIAM
13 years 9 months ago
Semi-supervised Bio-named Entity Recognition with Word-Codebook Learning
We describe a novel semi-supervised method called Word-Codebook Learning (WCL), and apply it to the task of bio-named entity recognition (bioNER). Typical bioNER systems can be seen...
Pavel P. Kuksa, Yanjun Qi
SDM
2008
SIAM
13 years 9 months ago
Simultaneous Unsupervised Learning of Disparate Clusterings
Most clustering algorithms produce a single clustering for a given data set even when the data can be clustered naturally in multiple ways. In this paper, we address the difficult...
Prateek Jain, Raghu Meka, Inderjit S. Dhillon
ACL
2006
13 years 9 months ago
Modelling Lexical Redundancy for Machine Translation
Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of th...
David Talbot, Miles Osborne