We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
While classic information retrieval methods return whole documents as a result of a query, many information demands would be better satisfied by fine-grain access inside the docu...
Empirical equations are an important class of regularities that can be discovered in databases. In this paper we concentrate on the role of equations as de nitions of attribute val...
We have been working on two different KDD systems for scientific data. One system involves comparative genomics, where the database contains more than 60,000 plant gene and protei...
Many vertical search tasks such as local search focus on specific domains. The meaning of relevance in these verticals is domain-specific and usually consists of multiple well-d...
Changsung Kang, Xuanhui Wang, Yi Chang, Belle L. T...