Sciweavers

» Scalable data dissemination using hybrid methods
SIGKDD
2000
Scalability for Clustering Algorithms Revisited
This paper presents a simple new algorithm that performs k-means clustering in one scan of a dataset, while using a buffer for points from the dataset of fixed size. Experiments show...
Fredrik Farnstrom, James Lewis, Charles Elkan
PVLDB
2010
SAPPER: Subgraph Indexing and Approximate Matching in Large Graphs
With the emergence of new applications, e.g., computational biology, new software engineering techniques, social networks, etc., more data is in the form of graphs. Locating occur...
Shijie Zhang, Jiong Yang, Wei Jin
KDD
2002
ACM
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman
BMCBI
2007
Ringo - an R/Bioconductor package for analyzing ChIP-chip readouts
Background: Chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) is a high-throughput assay for DNA-protein-binding or post-translational chromatin/histone modi...
Joern Toedling, Oleg Sklyar, Tammo Krueger, Jenny ...
ICML
2001
IEEE
Feature selection for high-dimensional genomic microarray data
We report on the successful application of feature selection methods to a classification problem in molecular biology involving only 72 data points in a 7130 dimensional space. Ou...
Eric P. Xing, Michael I. Jordan, Richard M. Karp