Sciweavers

237 search results - page 40 / 48
» Development of Multi-Criteria Metrics for Evaluation of Data...
KDD 2009, ACM
14 years 9 months ago
Exploiting Wikipedia as external knowledge for document clustering
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
WWW 2005, ACM
14 years 9 months ago
Duplicate detection in click streams
We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi
ICDM 2008, IEEE
14 years 3 months ago
Exploiting Local and Global Invariants for the Management of Large Scale Information Systems
This paper presents a data-oriented approach to modeling complex computing systems, in which an ensemble of correlation models is discovered to represent the system status. I...
Haifeng Chen, Haibin Cheng, Guofei Jiang, Kenji Yo...
PPOPP 2010, ACM
14 years 6 months ago
A distributed placement service for graph-structured and tree-structured data
Effective data placement strategies can enhance the performance of data-intensive applications implemented on high-end computing clusters. Such strategies can have a significant i...
Gregory Buehrer, Srinivasan Parthasarathy, Shirish...
KDD 2009, ACM
14 years 9 months ago
Adapting the right measures for K-means clustering
Clustering validation is a long-standing challenge in the clustering literature. While many validation measures have been developed for evaluating the performance of clustering al...
Junjie Wu, Hui Xiong, Jian Chen