Sciweavers

406 search results - page 31 / 82
» Database-friendly random projections
SIGIR
2006
ACM
14 years 2 months ago
Finding near-duplicate web pages: a large-scale evaluation of algorithms
Broder et al.’s [3] shingling algorithm and Charikar’s [4] random projection based approach are considered “state-of-the-art” algorithms for finding near-duplicate web pag...
Monika Rauch Henzinger
PAKDD
2007
ACM
14 years 2 months ago
A Fast Algorithm for Finding Correlation Clusters in Noise Data
Abstract. Noise significantly affects cluster quality. Conventional clustering methods hardly detect clusters in a data set containing a large amount of noise. Projected clusterin...
Jiuyong Li, Xiaodi Huang, Clinton Selke, Jianming ...
MCS
2009
Springer
14 years 1 months ago
Random Ordinality Ensembles: A Novel Ensemble Method for Multi-valued Categorical Data
Abstract. Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have ...
Amir Ahmad, Gavin Brown
ALENEX
2007
13 years 10 months ago
ScrewBox: a Randomized Certifying Graph-Non-Isomorphism Algorithm
We present a novel randomized approach to the graph isomorphism problem. Our algorithm aims at solving difficult instances by producing randomized certificates for non-isomorphis...
Martin Kutz, Pascal Schweitzer
STOC
2006
ACM
14 years 9 months ago
A randomized polynomial-time simplex algorithm for linear programming
We present the first randomized polynomial-time simplex algorithm for linear programming. Like the other known polynomial-time algorithms for linear programming, its running time ...
Jonathan A. Kelner, Daniel A. Spielman