Improving the Clustering of Blogosphere with a Self-term Enriching Technique

The analysis of blogs is emerging as an exciting new area in the text processing field, one which attempts to harness and exploit the vast quantity of information being published by individuals. However, the particular characteristics of blogs (shortness, vocabulary size and nature, etc.) make it difficult to achieve good results using automated clustering techniques. Moreover, because many blogs can be considered narrow-domain, exploiting external linguistic resources can have limited value. In this paper, we present a methodology to improve the performance of clustering techniques on blogs which does not rely on external resources. Our results show that this technique can produce significant improvements in the quality of the clusters produced.

Fernando Perez-Tellez, David Pinto, John Cardiff, Paolo Rosso

Automated Clustering Techniques | Clustering Techniques | Signal Processing | Text Processing Field | TSD 2009

Added	27 May 2010
Updated	27 May 2010
Type	Conference
Year	2009
Where	TSD
Authors	Fernando Perez-Tellez, David Pinto, John Cardiff, Paolo Rosso
