Sciweavers

373 search results - page 38 / 75
» Correcting the Document Layout: A Machine Learning Approach
HT
2005
ACM
15 years 9 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
SPLST
2003
15 years 4 months ago
Compacting XML Documents
Nowadays, one of the most common formats for storing information is XML. The size of XML documents can be rather large, and they may contain redundant attributes which can...
Miklós Kálmán, Ferenc Havasi,...
SIGIR
2006
ACM
15 years 9 months ago
LDA-based document models for ad-hoc retrieval
Search algorithms incorporating some form of topic model have a long history in information retrieval. For example, cluster-based retrieval has been studied since the 60s and has ...
Xing Wei, W. Bruce Croft
EMNLP
2010
15 years 1 months ago
NLP on Spoken Documents Without ASR
There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...
Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...
ECML
2005
Springer
15 years 9 months ago
Error-Sensitive Grading for Model Combination
Ensemble learning is a powerful learning approach that combines multiple classifiers to improve prediction accuracy. An important decision while using an ensemble of cla...
Surendra K. Singhi, Huan Liu