While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
The semantic web is expected to have an impact at least as big as that of the existing HTML based web, if not greater. However, the challenge lays in creating this semantic web an...
For the manual semantic markup of documents to become widespread, users must be able to express annotations that conform to ontologies (or schemas) that have shared meaning. Howev...
Manually querying search engines in order to accumulate a large body of factual information is a tedious, error-prone process of piecemeal search. Search engines retrieve and rank...
Oren Etzioni, Michael J. Cafarella, Doug Downey, S...