Sciweavers

268 search results - page 29 / 54
» Exploiting the Similarity of Non-Matching Terms at Retrieval...
CIKM
2009
Springer
13 years 11 months ago
Robust record linkage blocking using suffix arrays
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...
UIST
2009
ACM
14 years 1 month ago
Relaxed selection techniques for querying time-series graphs
Time-series graphs are often used to visualize phenomena that change over time. Common tasks include comparing values at different points in time and searching for specified patte...
Christian Holz, Steven Feiner
CIKM
2011
Springer
12 years 7 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
SIGIR
2010
ACM
13 years 11 months ago
Capturing page freshness for web search
Freshness has been increasingly realized by commercial search engines as an important criterion for measuring the quality of search results. However, most information retrieval met...
Na Dai, Brian D. Davison
CIKM
2010
Springer
13 years 5 months ago
Probabilistic ranking for relational databases based on correlations
This paper proposes a ranking method to exploit statistical correlations among pairs of attribute values in relational databases. For a given query, the correlations of the query ...
Jaehui Park, Sang-goo Lee