Abstract— Analyzing unknown data sets such as multispectral images often requires unsupervised techniques. Data clustering is a well known and widely used approach in such cases....
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...
Background: A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free t...
An approach to simultaneous document classification and word clustering is developed using a two-way mixture model of Poisson distributions. Each document is represented by a vect...
Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this pape...
Anna Huang, David N. Milne, Eibe Frank, Ian H. Wit...