Sciweavers

» Correcting the Document Layout: A Machine Learning Approach
DAS
2006
Springer
14 years 22 days ago
A Semi-automatic Adaptive OCR for Digital Libraries
This paper presents a novel approach for designing a semi-automatic adaptive OCR for large document image collections in digital libraries. We describe an interactive system for co...
Sachin Rawat, K. S. Sesh Kumar, Million Meshesha, ...
EUROCOLT
1997
Springer
14 years 18 days ago
Ordinal Mind Change Complexity of Language Identification
The approach of ordinal mind change complexity, introduced by Freivalds and Smith, uses (notations for) constructive ordinals to bound the number of mind changes made by a learnin...
Andris Ambainis, Sanjay Jain, Arun Sharma
ICDM
2006
IEEE
14 years 3 months ago
Unsupervised Learning of Tree Alignment Models for Information Extraction
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table, a data structure that better lends itself to high-level...
Philip Zigoris, Damian Eads, Yi Zhang
WWW
2005
ACM
14 years 2 months ago
Finding the boundaries of information resources on the web
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
ICML
2005
IEEE
14 years 9 months ago
Multi-way distributional clustering via pairwise interactions
We present a novel unsupervised learning scheme that simultaneously clusters variables of several types (e.g., documents, words and authors) based on pairwise interactions between...
Ron Bekkerman, Ran El-Yaniv, Andrew McCallum