Maximum margin clustering (MMC) is a recent large margin unsupervised learning approach that has often outperformed conventional clustering methods. Computationally, it involves n...
We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
Queries issued by casual users or specialists exploring a data set often point us to important subsets of the data, be it clusters, outliers or other features of particular import...
Training a good text detector requires a large amount of labeled data, which can be very expensive to obtain. Cotraining has been shown to be a powerful semi-supervised learning t...
We develop data structures for dynamic closest pair problems with arbitrary (not necessarily geometric) distance functions, based on a technique previously used by the author for ...