This paper presents a novel approach for designing a semi-automatic adaptive OCR for large document image collections in digital libraries. We describe an interactive system for co...
Sachin Rawat, K. S. Sesh Kumar, Million Meshesha, ...
The approach of ordinal mind change complexity, introduced by Freivalds and Smith, uses (notations for) constructive ordinals to bound the number of mind changes made by a learnin...
We propose an algorithm for extracting fields from HTML search results. The output of the algorithm is a database table– a data structure that better lends itself to high-level...
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
We present a novel unsupervised learning scheme that simultaneously clusters variables of several types (e.g., documents, words and authors) based on pairwise interactions between...