The similarity join is an important operation for mining high-dimensional feature spaces. Given two data sets, the similarity join computes all tuples (x, y) that are within a dis...
Using simulated data to develop and study diagnostic tools for data analysis is very beneficial. The user can gain insight about what happens when assumptions are violated since t...
We consider the problem of learning a ranking function that maximizes a generalization of the Wilcoxon-Mann-Whitney statistic on the training data. Relying on an ε-accurate approxim...
Vikas C. Raykar, Ramani Duraiswami, Balaji Krishna...
Imputation of missing values is one of the major tasks for data pre-processing in many areas. Whenever imputation of data from official statistics comes to mind, several (additi...
Matthias Templ, Alexander Kowarik, Peter Filzmoser
This paper presents a data-oriented approach to modeling complex computing systems, in which an ensemble of correlation models is discovered to represent the system status. I...