K-SVMeans: A Hybrid Clustering Algorithm for Multi-Type Interrelated Datasets



Identification of distinct clusters of documents in text collections has traditionally been addressed by assuming that data instances can only be represented by homogeneous, uniform features. Many real-world datasets, on the other hand, comprise multiple types of heterogeneous, interrelated components, such as web pages and hyperlinks, or online scientific publications together with their authors and publication venues, to name a few. In this paper, we present K-SVMeans, a clustering algorithm for multi-type interrelated datasets that integrates the well-known K-Means clustering with the highly popular Support Vector Machines. Experimental results on authorship analysis of two real-world web-based datasets show that K-SVMeans can successfully discover topical clusters of documents and achieve better clustering solutions than homogeneous data clustering.

Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles


Heterogeneous Interrelated Components | Homogeneous Data Clustering | Internet Technology | Multi-type Interrelated Datasets | WEBI 2007 |


Post Info

Added	09 Jun 2010
Updated	09 Jun 2010
Type	Conference
Year	2007
Where	WEBI
Authors	Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles
