Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of...
Probabilistic frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied to standard "certain"...
The discovery of biclusters, which denote groups of items that show coherent values across a subset of all the transactions in a data set, is an important type of analysis perform...
Gaurav Pandey, Gowtham Atluri, Michael Steinbach, ...
Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented cooperates...
Given a quarter of petabyte click log data, how can we estimate the relevance of each URL for a given query? In this paper, we propose the Bayesian Browsing Model (BBM), a new mod...
Customer preferences for products are drifting over time. Product perception and popularity are constantly changing as new selection emerges. Similarly, customer inclinations are ...
In large social networks, nodes (users, entities) are influenced by others for various reasons. For example, the colleagues have strong influence on one's work, while the fri...
We propose two approximation algorithms for identifying communities in dynamic social networks. Communities are intuitively characterized as "unusually densely knit" sub...
Spatial classification is the task of learning models to predict class labels based on the features of entities as well as the spatial relationships to other entities and their fe...