Motivation: Next-generation sequencing methods are generating increasingly massive datasets, yet still do not fully capture genetic diversity in the richest environments. To under...
Alex L. B. Leach, James P. J. Chong, Kelly R. Rede...
Random sampling is an appealing approach to build synopses of large data streams because random samples can be used for a broad spectrum of analytical tasks. Users are often inter...
Federated queries are regular relational queries accessing data on one or more remote relational or non-relational data sources, possibly combining them with tables stored in the ...
Stephan Ewen, Holger Kache, Volker Markl, Vijaysha...
Large-scale distributed systems have dense, complex code-bases that are assumed to perform multiple and inter-dependent tasks while user interaction is present. The way users inte...
Angelos Stavrou, Gabriela F. Cretu-Ciocarlie, Mich...
This paper offers a local distributed algorithm for expectation maximization in large peer-to-peer environments. The algorithm can be used for a variety of well-known data mining...