Sciweavers

893 search results - page 130 / 179
KDD
2002
ACM
Data Mining
14 years 9 months ago
Learning to match and cluster large high-dimensional data sets for data integration
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...
William W. Cohen, Jacob Richman
SIGKDD
2010
13 years 3 months ago
Latent semantic indexing (LSI) fails for TREC collections
The aim of latent semantic indexing (LSI) is to uncover the relationships between terms, hidden concepts, and documents. LSI uses the matrix factorization technique known as singu...
Avinash Atreya, Charles Elkan
CCR
2008
13 years 9 months ago
WebClass: adding rigor to manual labeling of traffic anomalies
Despite the flurry of anomaly-detection papers in recent years, effective ways to validate and compare proposed solutions have remained elusive. We argue that evaluating anomaly d...
Haakon Ringberg, Augustin Soule, Jennifer Rexford
CIKM
2007
Springer
14 years 3 months ago
Index compression is good, especially for random access
Index compression techniques are known to substantially decrease the storage requirements of a text retrieval system. As a side-effect, they may increase its retrieval performanc...
Stefan Büttcher, Charles L. A. Clarke
RIAO
1997
13 years 10 months ago
An Analysis of Statistical and Syntactic Phrases
As the amount of textual information available through the World Wide Web grows, there is a growing need for high-precision IR systems that enable a user to find useful information ...
Mandar Mitra, Chris Buckley, Amit Singhal, Claire ...