Simple presentation graphics are intuitive and easy-to-use, but only show highly aggregated data. Bar charts, for example, only show a rather small number of data values and x-y-pl...
—A novel method CLOSS intended for textual databases is proposed. It successfully identifies misspelled string clusters, even if the cluster border is not prominent. The method ...
The assumptions behind linear classifiers for categorical data are examined and reformulated in the context of the multinomial manifold, the simplex of multinomial models furnishe...
We present an algorithm for mining association rules from relational tables containing numeric and categorical attributes. The approach is to merge adjacent intervals of numeric v...
We present a probabilistic approach to learning a Gaussian Process classifier in the presence of unlabeled data. Our approach involves a "null category noise model" (NCN...