Sciweavers

1083 search results - page 12 / 217
» Efficient Discovery of Confounders in Large Data Sets
BMCBI
2011
12 years 11 months ago
PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus ge
Background: Results of phylogenetic analysis are often visualized as phylogenetic trees. Such a tree can typically only include up to a few hundred sequences. When more than a few...
Jiajie Zhang, Amir Madany Mamlouk, Thomas Martinet...
KDD
2001
ACM
14 years 8 months ago
GESS: a scalable similarity-join algorithm for mining large data sets in high dimensional spaces
The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) that are within a dis...
Jens-Peter Dittrich, Bernhard Seeger
CGA
1999
13 years 7 months ago
Visualizing Large Telecommunication Data Sets
displays to abstract network data and let users interact with it. We have implemented a full-scale Swift-3D prototype, which generated the examples we present here. Swift-3D We developed Sw...
Eleftherios Koutsofios, Stephen C. North, Daniel A...
KDD
2000
ACM
13 years 11 months ago
Efficient clustering of high-dimensional data sets with application to reference matching
Many important problems involve clustering large datasets. Although naive implementations of clustering are computationally expensive, there are established efficient techniques f...
Andrew McCallum, Kamal Nigam, Lyle H. Ungar
RECOMB
2006
Springer
14 years 8 months ago
Efficient Enumeration of Phylogenetically Informative Substrings
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore co...
Stanislav Angelov, Boulos Harb, Sampath Kannan, Sa...