We consider the coverage testing problem, where we are given a document and a corpus with a limited query interface and asked to determine whether the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
The poster describes a fast, simple, yet accurate method to associate large amounts of web resources stored in a search engine database with geographic locations. The method uses ...
We describe our experimental rhetoric engine Vox Populi that generates biased video-sequences from a repository of video interviews and other related audio-visual web sources. Use...
An important problem in search engine advertising is keyword generation. In the past, advertisers have preferred to bid for keywords that tend to have high search volumes and hen...
Traditional methods for extracting Web news article content are time-consuming and need much maintenance because they analyze the layout of news pages to generate the wrapp...