We present a divide-and-merge methodology for clustering a set of objects that combines a top-down "divide" phase with a bottom-up "merge" phase. In contrast, ...
David Cheng, Santosh Vempala, Ravi Kannan, Grant W...
This paper discusses generating document structure from annotated media repositories in a domain-independent manner. This approaches the vision of a universal RDF browser. We star...
Lloyd Rutledge, Jacco van Ossenbruggen, Lynda Hard...
This paper expands on a 1997 study of the amount and distribution of near-duplicate pages on the World Wide Web. We downloaded a set of 150 million web pages on a weekly basis ove...
imately 20 million abstracts. Browsing such a huge repository to find relevant information as well as providing elearning service requires new generation of interfaces. Methods suc...
Mohammed Yeasin, Bhanu Vanteru, Jahangheer S. Shai...
Web image search is inspired by text search techniques; it mainly relies on indexing textual data that surround the image file. But retrieval results are often noisy and image pro...