Abstract. Extracting data from web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. In this paper, we propose a...
We present a new approach to extracting information from unstructured documents based on an application ontology that describes a domain of interest. Starting with such an ontolog...
David W. Embley, Douglas M. Campbell, Randy D. Smi...
Existing methods of information extraction from HTML documents include manual approach, supervised learning and automatic techniques. The manual method has high precision and reca...
Mirel Cosulschi, Adrian Giurca, Bogdan Udrescu, Ni...
This paper deals with studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources ...