We present Opal, a light-weight framework for interactively locating missing web pages (http status code 404). Opal is an example of “in vivo” preservation: harnessing the col...
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...