Sciweavers

» A Genetic Algorithm for Clustering on Very Large Data Sets
BMCBI
2008
13 years 7 months ago
The SeqWord Genome Browser: an online tool for the identification and visualization of atypical regions of bacterial genomes thr
Background: Data mining in large DNA sequences is a major challenge in microbial genomics and bioinformatics. Oligonucleotide usage (OU) patterns provide a wealth of information f...
Hamilton Ganesan, Anna S. Rakitianskaia, Colin F. ...
SOSP
2007
ACM
14 years 4 months ago
Sinfonia: a new paradigm for building scalable distributed systems
We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distri...
Marcos Kawazoe Aguilera, Arif Merchant, Mehul A. S...
CSDA
2008
13 years 7 months ago
Outlier identification in high dimensions
A computationally fast procedure for identifying outliers is presented, that is particularly effective in high dimensions. This algorithm utilizes simple properties of principal c...
Peter Filzmoser, Ricardo A. Maronna, Mark Werner
SDM
2007
SIAM
13 years 9 months ago
Topic Models over Text Streams: A Study of Batch and Online Unsupervised Learning
Topic modeling techniques have widespread use in text data mining applications. Some applications use batch models, which perform clustering on the document collection in aggregat...
Arindam Banerjee, Sugato Basu
BMCBI
2008
13 years 7 months ago
EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management
Background: Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large d...
Geraint Barton, J. C. Abbott, Norie Chiba, D. W. H...