PEIR, the Personal Environmental Impact Report, is a participatory sensing application that uses location data sampled from everyday mobile phones to calculate personalized estima...
Min Mun, Sasank Reddy, Katie Shilton, Nathan Yau, ...
Text classification has matured as a research discipline over the last decade. Independently, business intelligence over structured databases has long been a source of insights fo...
Multi-label problems arise in various domains such as multitopic document categorization and protein function prediction. One natural way to deal with such problems is to construc...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Information-theoretic clustering aims to exploit information theoretic measures as the clustering criteria. A common practice on this topic is so-called INFO-K-means, which perfor...