In this paper we present an efficient, scalable and general algorithm for performing set joins on predicates involving various similarity measures like intersect size, Jaccard-coe...
This paper presents an innovative partitionbased time join strategy for temporal databases where time is represented by time intervals. The proposed method maps time intervals to ...
Cloud enabled systems have become a crucial component to efficiently process and analyze massive amounts of data. One of the key data processing and analysis operations is the Sim...
To achieve high reliability and scalability, most large-scale data warehouse systems have adopted the cluster-based architecture. In this paper, we propose the design of a new clu...
Yuting Lin, Divyakant Agrawal, Chun Chen, Beng Chi...
Label stream partition is a useful technique to reduce the input I/O cost of holistic twig join by pruning useless streams beforehand. The Prefix Path Stream (PPS) partition scheme...