Sciweavers


CIKM
2000
Springer


A Semi-Supervised Document Clustering Technique for Information Organization




This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organize documents into groups based only on similarity measures. Unfortunately, traditional approaches to document clustering often fail to discern structural details hidden within the document corpus because their algorithms depend strongly on the documents themselves and their similarity to each other. In this paper, we attempt to isolate more semantically coherent clusters by employing the domain-specific knowledge provided by a document analyst. By using external human knowledge to guide the clustering mechanism with some flexibility when creating the clusters, clustering efficiency can be considerably enhanced. As a basic clustering strategy, we use a variant of complete-linkage agglomerative hierarchical clustering, and develop the concepts (or seeds) of requested clusters by exploiti...
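The abstract is truncated, so the paper's actual seeding procedure is not visible here. As a rough illustration only, the sketch below shows one way analyst-provided seeds could steer complete-linkage agglomerative clustering: seed groups are pre-merged before the usual bottom-up merging starts. The function names, the term-frequency representation, the cosine distance, and the toy documents are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(text):
    """Bag-of-words term-frequency vector (an assumed, very simple representation)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def complete_linkage(c1, c2, dist):
    """Complete linkage: the distance between the two farthest members."""
    return max(dist[i][j] for i in c1 for j in c2)


def seeded_complete_linkage(docs, seeds, k):
    """Cluster docs into k groups, starting from hypothetical analyst seeds.

    seeds is a list of disjoint lists of document indices that the analyst
    says belong together; each seed group starts as one cluster, and every
    other document starts as a singleton.
    """
    vecs = [tf_vector(d) for d in docs]
    n = len(docs)
    dist = [[cosine_distance(vecs[i], vecs[j]) for j in range(n)] for i in range(n)]

    seeded = {i for group in seeds for i in group}
    clusters = [set(g) for g in seeds] + [{i} for i in range(n) if i not in seeded]

    # Standard agglomerative loop: repeatedly merge the closest pair of clusters.
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: complete_linkage(clusters[p[0]], clusters[p[1]], dist),
        )
        clusters[a] |= clusters[b]
        del clusters[b]  # safe: combinations guarantees a < b
    return [sorted(c) for c in clusters]


# Toy corpus, invented for illustration.
docs = [
    "stock market shares trading",    # 0
    "market shares investors stock",  # 1
    "football match goal league",     # 2
    "league goal football striker",   # 3
    "bank interest rates loan",       # 4
    "loan bank investors rates",      # 5
]

unsupervised = seeded_complete_linkage(docs, [], 3)
supervised = seeded_complete_linkage(docs, [[0, 4]], 3)
print(unsupervised)  # [[0, 1], [2, 3], [4, 5]]
print(supervised)    # [[0, 4], [1, 5], [2, 3]]
```

With no seeds, the partition follows word overlap alone. Seeding documents 0 and 4 together, which share no terms, forces a grouping that similarity could not find on its own, and the remaining documents rearrange around it.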

Han-joon Kim, Sang-goo Lee


Agglomerative Hierarchical Clustering | CIKM 2000 | Clustering Method | Document Clustering | Information Management


Related Content

» A SemiSupervised Document Clustering Algorithm Based on EM

» Aggregating Multiple Instances in Relational Database Using SemiSupervised Genetic Algorit...

» Multilabel ASRS Dataset Classification Using Semi Supervised Subspace Clustering

» Semi Supervised Spectral Clustering for Regulatory Module Discovery

» SemiSupervised Ensemble Ranking

» Biomarker Discovery Across Annotated and Unannotated Microarray Datasets Using SemiSupervi...

» Using Self Organizing Map to Cluster Arabic Crime Documents

» SONIA A Service for Organizing Networked Information Autonomously

» Combining preference and contentbased approaches for improving document clustering effecti...

Post Info

Added	02 Aug 2010
Updated	02 Aug 2010
Type	Conference
Year	2000
Where	CIKM
Authors	Han-joon Kim, Sang-goo Lee
