Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large-scale diagnostic data streams. Such streams are rarely s...
Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthu...
In this study, we propose sketching algorithms for computing similarities between hierarchical data. Specifically, we look at data objects that are represented using leaf-labeled t...
Data mining and information retrieval tasks depend on a good distance function for measuring similarity between data instances. The most effective distance function must be for...
Roget’s Thesaurus has not been sufficiently appreciated in Natural Language Processing. We show that Roget’s and WordNet are birds of a feather. In a few typical tests, we ...
We propose a new approach for comparing profiles when the correlations among attributes can be represented as a tree. To account for these correlations, the profile is extended with...