Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...
Processing and extracting meaningful knowledge from count data is an important problem in data mining. The volume of data is increasing dramatically as the data is generated by da...
— In a data-mining approach, a model for estimation of Aerosol Optical Depth (AOD) from satellite observations is learned using collocated satellite and groundbased observations....
A data set can be clustered in many ways depending on the clustering algorithm employed, parameter settings used and other factors. Can multiple clusterings be combined so that th...
Alexander P. Topchy, Anil K. Jain, William F. Punc...
This paper explores unexpected results that lie at the intersection of two common themes in the KDD community: large datasets and the goal of building compact models. Experiments ...