Sciweavers

ISICA
2007
Springer

Instant Message Clustering Based on Extended Vector Space Model

Instant communication technologies such as Instant Messaging (IM) are now in widespread use. For such a large-scale mass-communication medium, clustering the text content is a practical way to analyze the characteristics of instant messages and to find or track hot social topics. However, a single instant message usually contains few keywords, and those may even be latent; moreover, a single message cannot describe the conversational context. This makes instant messages very different from general documents and makes common clustering algorithms unsuitable. A novel method called WR-KMeans is proposed, which merges related instant messages into a conversation and enriches the conversation's vector with words that do not appear in the conversation but are closely related to words that do. WR-KMeans then clusters conversations in this extended vector space in the same manner as k-means. Experiments on the public datasets show that WR-KMeans outperforms the traditional k-means an...
Le Wang, Yan Jia, Weihong Han
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where ISICA
Authors Le Wang, Yan Jia, Weihong Han
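
The abstract describes three steps: merging messages into conversations, extending each conversation's vector with related words, and running a k-means-style clustering. A minimal Python sketch of that pipeline follows. It is not the authors' implementation: the abstract does not say how word relatedness is measured or how added words are weighted, so the co-occurrence score, the `alpha` and `threshold` parameters, the cosine similarity, and the farthest-first initialization are all assumptions made for illustration. The grouping of messages into conversations is taken as given.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def term_vector(conversation):
    """Bag-of-words counts over all messages of one conversation."""
    return Counter(w for msg in conversation for w in msg.lower().split())

def relatedness(vectors):
    """Assumed measure: share of conversations containing a that also contain b."""
    df, co = Counter(), defaultdict(Counter)
    for v in vectors:
        words = sorted(v)
        df.update(words)
        for a, b in combinations(words, 2):
            co[a][b] += 1
            co[b][a] += 1
    return {a: {b: n / df[a] for b, n in nbrs.items()} for a, nbrs in co.items()}

def extend(vector, rel, alpha=0.5, threshold=0.5):
    """Add words absent from the conversation but related to its words.
    alpha and threshold are hypothetical tuning parameters."""
    out = {w: float(x) for w, x in vector.items()}
    for w, weight in vector.items():
        for r, score in rel.get(w, {}).items():
            if r not in vector and score >= threshold:
                out[r] = max(out.get(r, 0.0), alpha * score * weight)
    return out

def cosine(u, v):
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def kmeans(vectors, k, iterations=10):
    """k-means with cosine similarity and deterministic farthest-first seeding."""
    centroids = [dict(vectors[0])]
    while len(centroids) < k:
        far = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(dict(far))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                total = defaultdict(float)
                for m in members:
                    for w, x in m.items():
                        total[w] += x
                centroids[c] = {w: x / len(members) for w, x in total.items()}
    return labels

def cluster_conversations(conversations, k):
    raw = [term_vector(c) for c in conversations]
    rel = relatedness(raw)
    return kmeans([extend(v, rel) for v in raw], k)

# Toy data, invented for illustration only.
conversations = [
    ["football match tonight", "who won the match"],
    ["football score", "great goal"],
    ["stock market falls", "sell stock"],
    ["market crash", "stock prices"],
]
labels = cluster_conversations(conversations, k=2)
print(labels)
```

In this sketch an added word never outweighs the word that introduced it, because its weight is scaled by `alpha` and by the relatedness score. Whether WR-KMeans weights added words this way is not stated in the abstract.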