We propose an efficient sampling-based outlier detection method for large, high-dimensional data. Our method consists of two phases. In the first phase, we combine a "sampling...
Timothy de Vries, Sanjay Chawla, Pei Sun, Gia Vinh...
For small samples, classifier design algorithms typically suffer from overfitting. Given a set of features, a classifier must be designed and its error estimated. For small samples, ...
Seungchan Kim, Edward R. Dougherty, Junior Barrera...
Background: The number of algorithms available to predict ligand-protein interactions is large and ever-increasing. The number of test cases used to validate these methods is usua...
Luis A. Diago, Persy Morell, Longendri Aguilera, E...
Parallel coordinate plots (PCPs) are commonly used in information visualization to provide insight into multivariate data. These plots help to spot correlations between variables....
Efficient index construction in multidimensional data spaces is important for many knowledge discovery algorithms, because construction times typically must be amortized by perform...