Relational data streams and XML streams have previously provided two separate research foci, but their unified support by a single Data Stream Management System (DSMS) is very des...
In many recent applications, data are continuously being disseminated from a source to a set of servers. In this paper, we propose a cost-based approach to construct dissemination...
Publication records are often found in the authors' personal home pages. If such a record is partitioned into a list of semantic fields of authors, title, date, etc., the uns...
Wei Zhang, Clement T. Yu, Neil R. Smalheiser, Vetl...
A burst is a large number of events occurring within a certain time window. As an unusual activity, it is a noteworthy phenomenon in many natural and social processes. Many da...
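The definition of a burst given above (many events inside one time window) can be illustrated with a minimal sliding-window sketch. This is not the method of the paper, whose abstract is truncated here. The function name `find_bursts` and the parameters `window` and `threshold` are hypothetical, chosen only for illustration, and the timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


def find_bursts(timestamps, window, threshold):
    """Report every event at which the trailing window holds >= threshold events.

    timestamps: event times in non-decreasing order (assumption).
    window:     length of the trailing time window (hypothetical parameter).
    threshold:  minimum event count that qualifies as a burst (hypothetical).
    Returns a list of (first_event_in_window, current_event, count) tuples.
    """
    recent = deque()  # events inside the trailing window (t - window, t]
    bursts = []
    for t in timestamps:
        recent.append(t)
        # Evict events that are no longer inside the trailing window.
        while recent[0] <= t - window:
            recent.popleft()
        if len(recent) >= threshold:
            bursts.append((recent[0], t, len(recent)))
    return bursts


events = [1, 2, 2.5, 3, 10, 11, 20, 20.1, 20.2, 20.3]
print(find_bursts(events, window=2.0, threshold=4))
```

Each event is appended and evicted at most once, so the scan is linear in the number of events. A fixed threshold is the simplest possible choice; it is used here only to make the definition concrete.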
We propose XSEED, a synopsis of path queries for cardinality estimation that is accurate, robust, efficient, and adaptive to memory budgets. XSEED starts from a very small kernel,...
We consider the problem of continuously maintaining order sketches over data streams with a relative rank error guarantee. Novel space-efficient and one-scan randomised technique...
Efficient indexing techniques have been developed for the exact and approximate substructure search in large scale graph databases. Unfortunately, the retrieval problem of structu...
In this paper, we propose a new model for coherent clustering of gene expression data called reg-cluster. The proposed model allows (1) the expression profiles of genes in a clust...
Xin Xu, Ying Lu, Anthony K. H. Tung, Wei Wang 0010
The problem of frequently updating multi-dimensional indexes arises in many location-dependent applications. While the R-tree and its variants are among the dominant choices for ...