Mining cluster evolution from multiple correlated time-varying text corpora is important in exploratory text analytics. In this paper, we propose an approach called evolutionary h...
Memory leaks are caused by software programs that prevent the reclamation of memory that is no longer in use. They can cause significant slowdowns, exhaustion of available storag...
Existing cost-sensitive learning methods work with unequal misclassification cost that is given by domain knowledge and appears as precise values. In many real-world applications,...
An ensemble is a set of learned models that make decisions collectively. Although an ensemble is usually more accurate than a single learner, existing ensemble methods often tend ...
In this paper we introduce a modular, highly flexible, opensource environment for data generation. Using an existing graphical data flow tool, the user can combine various types...
At Google, experimentation is practically a mantra; we evaluate almost every change that potentially affects what our users experience. Such changes include not only obvious user-...
Diane Tang, Ashish Agarwal, Deirdre O'Brien, Mike ...
We study the problem of finding the k most frequent items in a stream of items for the recently proposed max-frequency measure. Based on the properties of an item, the maxfrequen...