A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...
Georeferenced information is growing every day, and geographical information systems are becoming crucial in many decision processes. As a consequence, extracting knowledge from G...
This paper presents PDF-TREX, an heuristic approach for table recognition and extraction from PDF documents. The heuristics starts from an initial set of basic content elements an...
— In this paper, we propose a new architecture that can extract information of numerous objects in an image at highspeed. Various characteristics can be obtained from the image m...
This paper proposes a new region extraction algorithm based on cellular automaton operation, which only utilizes the region boundary information of the image. A simple pixel circu...