Comprehensible and Accurate Cluster Labels in Text Clustering

The purpose of text clustering in information retrieval is to discover groups of semantically related documents. Accurate and comprehensible cluster descriptions (labels) let the user comprehend the collection's content faster and are essential for various document browsing interfaces. The task of creating descriptive, sensible cluster labels is difficult: typical text clustering algorithms focus on optimizing proximity between documents inside a cluster and rely on keyword representation for describing discovered clusters. In the approach called Description Comes First (DCF), cluster labels are as important as document groups: DCF promotes machine discovery of comprehensible candidate cluster labels, which are later used to discover related document groups. In this paper we describe an application of DCF to the k-Means algorithm, including results of experiments performed on the 20-newsgroups document collection. Experimental evaluation showed that DCF does not decrease the metrics used to a...
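
The description-first idea in the abstract (find readable candidate labels, then use them to collect documents) can be illustrated with a small sketch. It is not the authors' algorithm and it leaves out the k-Means step entirely. Everything in it is an assumption made for illustration: candidate labels are taken to be word bigrams that occur in at least `min_docs` documents, a document joins a label's group when it contains that bigram, and the names `bigrams`, `candidate_labels` and `assign_documents` are hypothetical.

```python
from collections import Counter, defaultdict


def bigrams(text):
    """Return the set of lowercase word bigrams in a text."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def candidate_labels(docs, min_docs=2):
    """Step 1 (assumed): phrases found in at least `min_docs` documents."""
    doc_freq = Counter()
    for text in docs:
        # A set is used, so each bigram counts once per document.
        doc_freq.update(bigrams(text))
    return sorted(label for label, n in doc_freq.items() if n >= min_docs)


def assign_documents(docs, labels):
    """Step 2 (assumed): group document indices under each label they contain."""
    groups = defaultdict(list)
    for index, text in enumerate(docs):
        present = bigrams(text)
        for label in labels:
            if label in present:
                groups[label].append(index)
    return dict(groups)


# Toy collection, invented for this example only.
docs = [
    "space shuttle launch delayed",
    "nasa space shuttle mission",
    "ice hockey playoffs tonight",
    "ice hockey season opener",
]

labels = candidate_labels(docs)
groups = assign_documents(docs, labels)
print(labels)
print(groups)
```

In this sketch the labels exist before any group does, so every group has a human-readable phrase as its description by construction. A keyword-based description computed after clustering offers no such guarantee.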

Jerzy Stefanowski, Dawid Weiss

Cluster Labels | Document | Information Technology | RIAO 2007 | Sensible Cluster Labels

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	RIAO
Authors	Jerzy Stefanowski, Dawid Weiss
