We introduce a new method for data clustering based on a particular Gaussian mixture model (GMM). Each cluster of data, modeled as a GMM in an input space, is interpreted as a hy...
We present an approach to document clustering based on winnowing fingerprints that achieved good effectiveness with considerable savings in memory space and computation tim...
Cluster label quality is crucial for browsing topic hierarchies obtained via document clustering. Intuitively, the hierarchical structure should influence the labeling accuracy. H...
Users of Web search engines are often forced to sift through the long ordered list of document “snippets” returned by the engines. The IR community has explored document cluste...
In multi-instance learning, the training examples are bags composed of unlabeled instances, and the task is to predict the labels of unseen bags by analyzing the training...