Finding structure in multiple streams of data is an important problem. Consider the streams of data owing from a robot's sensors, the monitors in an intensive care unit, or p...
Property testing deals with tasks where the goal is to distinguish between the case that an object (e.g., function or graph) has a prespecified property (e.g., the function is li...
Abstract—MapReduce is emerging as a generic parallel programming paradigm for large clusters of machines. This trend combined with the growing need to run machine learning (ML) a...
Amol Ghoting, Rajasekar Krishnamurthy, Edwin P. D....
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
One problem of data-driven answer extraction in open-domain factoid question answering is that the class distribution of labeled training data is fairly imbalanced. This imbalance...
Michael Wiegand, Jochen L. Leidner, Dietrich Klako...