Previous work shows that a web page can be partitioned into multiple segments or blocks, and that the importance of those blocks within a page is usually not equal. Also, it is pro...
Ruihua Song, Haifeng Liu, Ji-Rong Wen, Wei-Ying Ma
Labeling text data is time-consuming but essential for automatic text classification. In particular, manually creating multiple labels for each document may become impractical ...
We present a new approach to large-scale graph mining based on so-called backbone refinement classes. The method efficiently mines tree-shaped subgraph descriptors under minimum f...
Data clustering is an important task in many disciplines. A large number of studies have attempted to improve clustering by using the side information that is often encoded as pai...
This paper addresses the issue of "email data cleaning" for text mining. Many text mining applications need to take emails as input. Email data is usually noisy and thus i...