Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature...
In spite of the great progress in the data mining field in recent years, the problem of missing and uncertain data has remained a great challenge for data mining algorithms. Many ...
New biological experimental techniques are continuing to generate large amounts of data using DNA, RNA, human genome and protein sequences. The quantity and quality of data from t...
Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper we present a scalable sampling imple...
Extracting dense sub-components from graphs efficiently is an important objective in a wide range of application domains ranging from social network analysis to biological network...
Nan Wang, Srinivasan Parthasarathy, Kian-Lee Tan, ...