In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages...
Abstract. When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. Ho...
To access data sources on the Web, a crucial step is wrapping, which translates query responses, rendered in textual HTML, back into their relational form. Traditionally, this pro...
Shui-Lung Chuang, Kevin Chen-Chuan Chang, ChengXia...
We describe an infrastructure for the collection and management of large amounts of text, and discuss the possibility of information extraction and visualisation from text corpora...
We describe DEIMOS, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that makes an en...