Sciweavers

» Correcting the Document Layout: A Machine Learning Approach
HT
2005
ACM
14 years 2 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
SPLST
2003
13 years 10 months ago
Compacting XML Documents
Abstract. Nowadays one of the most common formats for storing information is XML. The size of XML documents can be rather large, and they may contain redundant attributes which can...
Miklós Kálmán, Ferenc Havasi,...
SIGIR
2006
ACM
14 years 3 months ago
LDA-based document models for ad-hoc retrieval
Search algorithms incorporating some form of topic model have a long history in information retrieval. For example, cluster-based retrieval has been studied since the 60s and has ...
Xing Wei, W. Bruce Croft
EMNLP
2010
13 years 7 months ago
NLP on Spoken Documents Without ASR
There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...
Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...
ECML
2005
Springer
14 years 2 months ago
Error-Sensitive Grading for Model Combination
Abstract. Ensemble learning is a powerful learning approach that combines multiple classifiers to improve prediction accuracy. An important decision while using an ensemble of cla...
Surendra K. Singhi, Huan Liu