We present PROUD - A PRObabilistic approach to processing similarity queries over Uncertain Data streams, where the data streams here are mainly time series streams. In contrast t...
Mi-Yen Yeh, Kun-Lung Wu, Philip S. Yu, Ming-Syan C...
Massive data streams are now fundamental to many data processing applications. For example, Internet routers produce large scale diagnostic data streams. Such streams are rarely s...
Graham Cormode, Mayur Datar, Piotr Indyk, S. Muthu...
Background: Data generated using `omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of...
Yu Guo, Armin Graber, Robert N. McBurney, Raji Bal...
We present V-measure, an external entropybased cluster evaluation measure. Vmeasure provides an elegant solution to many problems that affect previously defined cluster evaluatio...
XML and other semi-structured data may have partially specified or missing schema information, motivating the use of a structural summary which can be automatically computed from ...
Raghav Kaushik, Pradeep Shenoy, Philip Bohannon, E...