Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such a...
The growing demand for large-scale data mining and data analysis applications has led both industry and academia to design new types of highly scalable data-intensive computing pl...
Yingyi Bu, Bill Howe, Magdalena Balazinska, Michae...
In earlier work we have introduced and explored a variety of different probabilistic models for the problem of answering selectivity queries posed to large sparse binary data set...
Naïve Bayes (NB) classifier has long been considered a core methodology in text classification mainly due to its simplicity and computational efficiency. There is an increasing n...