Biological data, such as gene expression profiles or protein sequences, is often organized in a hierarchy of classes, where the instances assigned to "nearby" classes in...
Background: Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson corre...
Jianchao Yao, Chunqi Chang, Mari L. Salmi, Yeung S...
Streaming user-generated content in the form of blogs, microblogs, forums, and multimedia sharing sites, provides a rich source of data from which invaluable information and insig...
In this paper, we present Spade - the System S declarative stream processing engine. System S is a large-scale, distributed data stream processing middleware under development at ...
Today’s one-pass analytics applications tend to be data-intensive in nature and require the ability to process high volumes of data efficiently. MapReduce is a popular programm...
Boduo Li, Edward Mazur, Yanlei Diao, Andrew McGreg...