We consider the problem of finding highly correlated pairs in a large data set. That is, given a threshold not too small, we wish to report all the pairs of items (or binary attri...
Network operators are reluctant to share traffic data due to security and privacy concerns. Consequently, there is a lack of publicly available traces for validating and generaliz...
Martin Burkhart, Daniela Brauckhoff, Martin May, E...
Background: In the analysis of microarray data one generally produces a vector of p-values that for each gene give the likelihood of obtaining equally strong evidence of change by...
: The k nearest neighbor classification (k-NN) is a very simple and popular method for classification. However, it suffers from a major drawback, it assumes constant local class po...