In this paper we present a fast and efficient matching algorithm built on two key techniques: Spectral Correlation Based Feature Merge (SCBFM) and Two-Step Retrieval (TSR). ...
In this paper we focus on high-dimensional data sets for which the number of dimensions is an order of magnitude higher than the number of objects. From a classifier design standp...
A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
Data streams of real numbers are generated naturally in many applications. Online subsequence searching in data streams is becoming increasingly important ...
Feature selection is an important method for improving the efficiency and accuracy of text categorization algorithms by removing redundant and irrelevant terms from the ...