In this paper, we describe the use of similarity metrics in a novel visual environment for storing and retrieving favorite web pages. The similarity metrics, called Implicit Queri...
Mary Czerwinski, Susan T. Dumais, George G. Robert...
We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...
An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...
The deep web contains an order of magnitude more information than the surface web, but that information is hidden behind the web forms of a large number of web sites. Metasearch e...
Jeffrey P. Bigham, Ryan S. Kaminsky, Jeffrey Nicho...
When a new technology is introduced, the migration of existing applications to the new technology must be carefully considered. Automation can make some migrations feasible that o...