The Internet is a huge source of information. Search engines have indexed much of this information and can retrieve the webpages relevant to a given query. Howe...
In this paper, we report on the development of, and experiments with, IBM Content Harvester (CH), a tool to analyze and recover templates and content from word-processor-created text docume...
In this work, we compare different techniques for automatically finding candidate web pages to replace broken links. We extract information from the anchor text, the content of the p...
Background: In the last five years, large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with p...
Jorge Amigo, Antonio Salas, Christopher Phillips, ...
Feature aggregation is a critical technique in content-based image retrieval systems that employ multiple visual features to characterize image content. One problem in feature aggr...