Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aims to so...
We present in this paper ObjectRunner, a system for extracting, integrating and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
With the growing use of the Web and the data available on it, a wide range of scholarly information is now accessible online. Extraction, aggregation, and visualization of such inf...
Many knowledge workers increasingly use online resources to find the latest developments in their specialty and articles of interest. To extract relevant information from s...