We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
Features in many real world applications such as Cheminformatics, Bioinformatics and Information Retrieval have complex internal structure. For example, frequent patterns mined fr...
In this paper we introduce the Generalized Bayesian Committee Machine (GBCM) for applications with large data sets. In particular, the GBCM can be used in the context of kernel ba...
The widespread adoption of the Resource Description Framework (RDF) for the representation of both open web and enterprise data is the driving force behind the increasing research...
Multimedia similarity search has been playing a critical role in many novel applications. Typically, multimedia objects are described by high-dimensional feature vectors (or point...
Zi Huang, Heng Tao Shen, Jiajun Liu, Xiaofang Zhou