Taxonomies of the Web typically have hundreds of thousands of categories and skewed category distribution over documents. It is not clear whether existing text classification tech...
Tie-Yan Liu, Yiming Yang, Hao Wan, Qian Zhou, Bin ...
We focus on the problem of finding patterns across two large, multidimensional datasets. For example, given feature vectors of healthy and of non-healthy patients, we want to answ...
Agma J. M. Traina, Caetano Traina Jr., Spiros Papa...
We develop an iterative relaxation algorithm, called RIBRA, for NMR protein backbone assignment. RIBRA applies nearest neighbor and weighted maximum independent set algorithms to ...
A common problem in many types of databases is retrieving the most similar matches to a query object. Finding those matches in a large database can be too slow to be practical, es...
Index trees created using distance based indexing are difficult to maintain online since the distance function involved is often costly to compute. This problem is intensified whe...