Sciweavers

» Correcting the Document Layout: A Machine Learning Approach
SIGIR
2003
ACM
14 years 2 months ago
Text categorization by boosting automatically extracted concepts
Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
Lijuan Cai, Thomas Hofmann
ICML
2003
IEEE
14 years 10 months ago
Exploration and Exploitation in Adaptive Filtering Based on Bayesian Active Learning
In the task of adaptive information filtering, a system receives a stream of documents but delivers only those that match a person's information need. As the system filters i...
Yi Zhang, Wei Xu, James P. Callan
EMNLP
2007
13 years 10 months ago
Bootstrapping Information Extraction from Field Books
We present two machine learning approaches to information extraction from semi-structured documents that can be used if no annotated training data are available, but there does ex...
Sander Canisius, Caroline Sporleder
SIGIR
2005
ACM
14 years 2 months ago
Predicting query difficulty on the web by learning visual clues
We describe a method for predicting query difficulty in a precision-oriented web search task. Our approach uses visual features from retrieved surrogate document representations (...
Eric C. Jensen, Steven M. Beitzel, David A. Grossm...
AND
2010
13 years 7 months ago
Reshaping automatic speech transcripts for robust high-level spoken document analysis
High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...