A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative e...
Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit ...
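For readers unfamiliar with the term, a Mahalanobis distance function is parameterized by a positive semidefinite matrix A and, in its squared form, measures (x - y)^T A (x - y). The snippet below is only a minimal sketch of that definition in plain Python. It does not implement the learning procedure from the abstract above, and the names `mahalanobis_sq`, `identity`, and `scaled` are illustrative assumptions, not taken from the paper.

```python
def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y).

    x, y: equal-length sequences of floats.
    A: square matrix (list of rows), assumed positive semidefinite.
    """
    # Difference vector d = x - y
    d = [xi - yi for xi, yi in zip(x, y)]
    # Matrix-vector product A d
    Ad = [sum(a_ij * d_j for a_ij, d_j in zip(row, d)) for row in A]
    # Inner product d^T (A d)
    return sum(d_i * ad_i for d_i, ad_i in zip(d, Ad))


# Hypothetical example matrices: the identity recovers squared Euclidean
# distance, while a diagonal matrix reweights individual features.
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[4.0, 0.0], [0.0, 1.0]]

p, q = [1.0, 2.0], [3.0, 5.0]
print(mahalanobis_sq(p, q, identity))
print(mahalanobis_sq(p, q, scaled))
```

Learning a Mahalanobis distance amounts to choosing A from data; with A fixed to the identity, the function reduces to the ordinary squared Euclidean distance.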
Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained using a set of obser...
Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanat...
We develop a semi-supervised learning method that constrains the posterior distribution of latent variables under a generative model to satisfy a rich set of feature expectation c...
Sunspots are a subject of interest to many astronomers and solar physicists. Sunspot observation, analysis and classification form an important part of furthering the knowledge a...
Trung Thanh Nguyen, Claire P. Willis, Derek J. Pad...