Optimally designing the location of training input points (active learning) and choosing the best model (model selection) are two important components of supervised learning and h...
This paper describes a successful but challenging application of data mining in the railway industry. The objective is to optimize maintenance and operation of trains through prog...
Bootstrapping is the process of improving the performance of a trained classifier by iteratively adding data that is labeled by the classifier itself to the training set, and retr...
This paper presents a novel way of improving POS tagging on heterogeneous data. First, two separate models are trained (generalized and domain-specific) from the same data set by...
Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This paper introduces layo...