Sciweavers

GRC
2005
IEEE

Semantic based clustering of Web documents

Abstract. A new methodology is developed that structures the semantics of a collection of documents into the geometry of a simplicial complex. A simplicial complex is topologically equivalent to a polyhedron in Euclidean space. The semantics of documents are structured by the geometry: a primitive concept is represented by a simplex, and a concept is represented by a connected component. Based on these structures, documents can be clustered into meaningful classes. Experiments with three different data sets from web pages and medical literature have shown that our approach performs significantly better than traditional clustering algorithms, such as k-means, AutoClass and Hierarchical Clustering (HAC).
keyword clustering, association(rule)s, topology, simplicial complex, polyhedron
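
The idea in the abstract can be illustrated with a small Python sketch. This is not the authors' algorithm, only a toy reading of it under stated assumptions: a "simplex" is taken to be any keyword set that co-occurs in at least min_support documents, a "concept" is a connected component of keywords linked by such sets, and each document is assigned to the component it overlaps most. The function names, the min_support and max_dim parameters, and the sample documents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_simplices(docs, min_support=2, max_dim=2):
    """Return keyword sets that co-occur in at least min_support documents.

    Assumption: a set of k+1 co-occurring keywords plays the role of a k-simplex.
    """
    counts = Counter()
    for words in docs:
        vocab = sorted(set(words))
        for size in range(1, max_dim + 2):
            for combo in combinations(vocab, size):
                counts[frozenset(combo)] += 1
    return {s for s, c in counts.items() if c >= min_support}


def connected_components(simplices):
    """Group keywords into components: keywords in the same simplex are linked."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for simplex in simplices:
        verts = list(simplex)
        root = find(verts[0])
        for v in verts[1:]:
            parent[find(v)] = root
            root = find(root)

    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


def cluster_documents(docs, components):
    """Label each document with the index of its best-overlapping component."""
    labels = []
    for words in docs:
        overlaps = [len(set(words) & comp) for comp in components]
        best = max(range(len(components)), key=lambda i: overlaps[i], default=None)
        labels.append(best if best is not None and overlaps[best] > 0 else None)
    return labels


# Hypothetical toy corpus: each document is a list of keywords.
docs = [
    ["wall", "street", "stock"],
    ["wall", "street", "market"],
    ["white", "house", "president"],
    ["white", "house", "senate"],
]
simplices = frequent_simplices(docs, min_support=2)
components = connected_components(simplices)
labels = cluster_documents(docs, components)
```

In this toy corpus the keyword pairs that recur across documents form two separate components, so the first two documents receive one label and the last two another. Rare keywords that fall below min_support contribute no simplex and do not affect the grouping.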
Tsau Young Lin, I-Jen Chiang
Added 24 Jun 2010
Updated 24 Jun 2010
Type Conference
Year 2005
Where GRC
Authors Tsau Young Lin, I-Jen Chiang