WEBI
2007
Springer

K-SVMeans: A Hybrid Clustering Algorithm for Multi-Type Interrelated Datasets

Identifying distinct clusters of documents in text collections has traditionally relied on the assumption that data instances can be represented only by homogeneous, uniform features. Many real-world datasets, on the other hand, comprise multiple types of heterogeneous, interrelated components, such as web pages and hyperlinks, or online scientific publications together with their authors and publication venues, to name a few. In this paper, we present K-SVMeans, a clustering algorithm for multi-type interrelated datasets that integrates the well-known K-Means clustering with the highly popular Support Vector Machines. Experimental results on authorship analysis of two real-world web-based datasets show that K-SVMeans can successfully discover topical clusters of documents and achieve better clustering solutions than homogeneous data clustering.
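The abstract does not say how the K-Means and SVM components are coupled, so the following is only a rough sketch of one way such a hybrid could work, not the authors' algorithm. It assumes exactly two clusters, documents and authors given as small numeric feature vectors, and a hypothetical `penalty` factor that inflates a document's distance to a cluster when a linear SVM, trained on author features, places that document's author in the other cluster. All function names, parameters, and the toy data are illustrative assumptions.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_linear_svm(X, y, epochs=100, eta=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.
    Labels are -1/+1. The bias is folded in as a constant feature (assumption)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            xa = list(xi) + [1.0]
            margin = yi * sum(wj * xj for wj, xj in zip(w, xa))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularization shrinkage
            if margin < 1.0:                            # hinge-loss subgradient step
                w = [wj + eta * yi * xj for wj, xj in zip(w, xa)]
    return w

def svm_predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score > 0 else 0

def k_svmeans_sketch(docs, doc_author, author_feats, init, penalty=1.0, max_iter=20):
    """Two-cluster K-Means on documents, biased by an SVM over author features."""
    centroids = [docs[i] for i in init]   # init: indices of the two seed documents
    author_pred, labels = None, None
    for _ in range(max_iter):
        new_labels = []
        for d, a in zip(docs, doc_author):
            def cost(c):
                # Inflate the distance when the SVM puts the author elsewhere.
                disagree = author_pred is not None and author_pred[a] != c
                return sq_dist(d, centroids[c]) * ((1.0 + penalty) if disagree else 1.0)
            new_labels.append(min((0, 1), key=cost))
        if new_labels == labels:
            break
        labels = new_labels
        # Standard K-Means centroid update (an empty cluster keeps its centroid).
        for c in (0, 1):
            members = [d for d, l in zip(docs, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        # Label each author by majority vote over their documents, then retrain.
        y = []
        for a in range(len(author_feats)):
            votes = [l for l, da in zip(labels, doc_author) if da == a]
            y.append(1 if 2 * sum(votes) > len(votes) else -1)
        w = train_linear_svm(author_feats, y)
        author_pred = [svm_predict(w, x) for x in author_feats]
    return labels

# Toy data (made up): the last document lies between the two groups,
# but its author (author 0) otherwise writes only in the first group.
docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.2, 5.2)]
doc_author = [0, 0, 1, 2, 2, 3, 0]
author_feats = [(-2.0, 0.5), (-1.0, -0.5), (1.0, 0.5), (2.0, -0.5)]

print(k_svmeans_sketch(docs, doc_author, author_feats, init=(0, 3), penalty=0.0))
# [0, 0, 0, 1, 1, 1, 1]  (distance only: plain K-Means behaviour)
print(k_svmeans_sketch(docs, doc_author, author_feats, init=(0, 3), penalty=1.0))
# [0, 0, 0, 1, 1, 1, 0]  (author information pulls the last document over)
```

With `penalty=0.0` the SVM has no influence and the sketch reduces to ordinary K-Means, which makes it easy to compare the homogeneous and multi-type assignments on the same data.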
Added: 09 Jun 2010
Updated: 09 Jun 2010
Type: Conference
Year: 2007
Where: WEBI
Authors: Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles