Many companies now routinely run massive data analysis jobs – expressed in some scripting language – on large clusters of low-end servers. Many analysis scripts are complex ...
Fuzzy/similarity joins have been widely studied in the research community and extensively used in real-world applications. This paper proposes and evaluates several algorithms f...
Foto N. Afrati, Anish Das Sarma, David Menestrina,...
In this paper, we focus on efficient keyword query processing for XML data based on SLCA and ELCA semantics. We propose for each keyword a novel form of inverted list, which i...
Junfeng Zhou, Zhifeng Bao, Wei Wang, Tok Wang Ling...
The web today is increasingly characterized by social and real-time signals, which we believe represent two frontiers in information retrieval. In this paper, we present Earlyb...
Michael Busch, Krishna Gade, Brian Larson, Patrick...
A facility for merging equivalent data streams can support multiple capabilities in a data stream management system (DSMS), such as query-plan switching and high availability. O...
Badrish Chandramouli, David Maier, Jonathan Goldst...