Expressing web page content in a way that computers can understand is the key to a semantic web. Generating ontological information from the web automatically using machine learni...
We present Opal, a light-weight framework for interactively locating missing web pages (http status code 404). Opal is an example of “in vivo” preservation: harnessing the col...
Time plays important roles in Web search, because most Web pages contain time information and a lot of Web queries are time-related. However, traditional search engines such as Go...
It is often desirable to extract structured information from raw web pages for better information browsing, query answering, and pattern mining. Many such Information Extraction (...
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...