Sciweavers

» On the Distance of Databases
KDD
2006
ACM
Measuring and extracting proximity in networks
Measuring distance or some other form of proximity between objects is a standard data mining tool. Connection subgraphs were recently proposed as a way to demonstrate proximity be...
Yehuda Koren, Stephen C. North, Chris Volinsky
KDD
2003
ACM
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal o...
Stephen D. Bay, Mark Schwabacher
VLDB
2007
ACM
Example-driven design of efficient record matching queries
Record matching is the task of identifying records that match the same real world entity. This is a problem of great significance for a variety of business intelligence applicatio...
Surajit Chaudhuri, Bee-Chung Chen, Venkatesh Ganti...
VLDB
2007
ACM
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
Many applications need to solve the following problem of approximate string matching: from a collection of strings, how to find those similar to a given string, or the strings in ...
Chen Li, Bin Wang, Xiaochun Yang
EDBT
2004
ACM
NNH: Improving Performance of Nearest-Neighbor Searches Using Histograms
Efficient search for nearest neighbors (NN) is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper we propose a novel techniq...
Liang Jin, Nick Koudas, Chen Li