The scalability problem in data mining involves the development of methods for handling large databases with limited computational resources. In this paper, we present a two-phase...
Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper, we present a scalable sampling imple...
Extracting entities (such as people, movies) from documents and identifying the categories (such as painter, writer) they belong to enable structured querying and data analysis ov...
Discovering coherent gene expression patterns in time-series gene expression data is an important task in bioinformatics research and biomedical applications. In this paper, we pr...
To the best of our knowledge, all existing graph pattern mining algorithms can mine only the closed, the maximal, or the complete set of frequent subgraphs, rather than graph generators whic...
Zhiping Zeng, Jianyong Wang, Jun Zhang, Lizhu Zhou