In KDD procedure, to fill in missing data typically requires a very large investment of time and energy - often 80% to 90% of a data analysis project is spent in making the data re...
The increasing complexity of enterprise databases and the prevalent lack of documentation incur significant cost in both understanding and integrating the databases. Existing solu...
In order to navigate huge document collections efficiently, tagged hierarchical structures can be used. For users, it is important to correctly interpret tag combinations. In this ...
Search engines use inverted files as index data structures to speed up the solution of user queries. The index is distributed on a set of processors forming a cluster of computer...
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han