The major challenge in mining data streams is the issue of concept drift, the tendency of the underlying data generation process to change over time. In this paper, we propose a g...
Schema matching is the problem of finding correspondences (mapping rules, e.g. logical formulae) between heterogeneous schemas e.g. in the data exchange domain, or for distribute...
We call data weakly labeled if it has no exact label but rather a numerical indication of correctness of the label "guessed" by the learning algorithm - a situation comm...
In EM and related algorithms, E-step computations distribute easily, because data items are independent given parameters. For very large data sets, however, even storing all of th...
In this paper, we propose a general framework for distributed boosting intended for efficient integrating specialized classifiers learned over very large and distributed homogeneo...