We present novel semi-supervised boosting algorithms that incrementally build linear combinations of weak classifiers through generic functional gradient descent using both labele...
Online forums represent one type of social media that is particularly rich for studying human behavior in information seeking and diffusing. The way users join communities is a re...
Motivation: Data centers are a critical component of modern IT infrastructure but are also among the worst environmental offenders through their increasing energy usage and the re...
Debprakash Patnaik, Manish Marwah, Ratnesh K. Shar...
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting corre...
Classifying nodes in networks is a task with a wide range of applications. It can be particularly useful in anomaly and fraud detection. Many resources are invested in the task of...
Mary McGlohon, Stephen Bay, Markus G. Anderle, Dav...
Recently many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and ...
One common predictive modeling challenge occurs in text mining problems is that the training data and the operational (testing) data are drawn from different underlying distributi...
Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-s...
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two ...
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, ...