Sciweavers

1443 search results - page 41 / 289
» Similarity Measures for Categorical Data: A Comparative Eval...
EDBT
2009
ACM
PROUD: a probabilistic approach to processing similarity queries over uncertain data streams
We present PROUD - A PRObabilistic approach to processing similarity queries over Uncertain Data streams, where the data streams here are mainly time series streams. In contrast t...
Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, Ming-Syan C...
VLDB
2002
ACM
Comparing Data Streams Using Hamming Norms (How to Zero In)
Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely s...
Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthu...
BMCBI
2010
Sample size and statistical power considerations in high-dimensionality data settings: a comparative study of classification alg
Background: Data generated using 'omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of...
Yu Guo, Armin Graber, Robert N. McBurney, Raji Bal...
EMNLP
2007
V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure
We present V-measure, an external entropy-based cluster evaluation measure. V-measure provides an elegant solution to many problems that affect previously defined cluster evaluatio...
Andrew Rosenberg, Julia Hirschberg
ICDE
2002
IEEE
Exploiting Local Similarity for Indexing Paths in Graph-Structured Data
XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed from ...
Raghav Kaushik, Pradeep Shenoy, Philip Bohannon, E...