Density Estimation and Visualization for Data Containing Clusters of Unknown Structure

Abstract. A method is proposed for measuring the density of data sets that contain an unknown number of clusters of unknown sizes. This method, called Pareto Density Estimation (PDE), uses hyperspheres to estimate data density. The radius of the hyperspheres is derived from information-optimal sets. PDE leads to a tool for visualizing the probability density distributions of variables (PDEplot). For Gaussian mixture data, this is an optimal empirical density estimation. A new kind of visualization of the density structure of high-dimensional data sets, the P-Matrix, is defined. The P-Matrix for a 79-dimensional data set from DNA array analysis is shown. The P-Matrix reveals local concentrations of data points representing similar gene expressions. The P-Matrix is also a very effective tool for detecting clusters and outliers in data sets.
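The hypersphere idea in the abstract can be illustrated with a minimal sketch: for each data point, count how many points fall inside a sphere of fixed radius centered on it. This is not the paper's implementation. The abstract does not say how the radius is computed from information-optimal sets, so the helper `placeholder_radius` below, and its default percentile, are hypothetical stand-ins that simply pick a low percentile of the pairwise distances. The function names and the toy data are likewise assumptions made for illustration.

```python
import numpy as np


def sphere_density(data, radius):
    """For each point, count the data points within `radius` of it."""
    data = np.asarray(data, dtype=float)
    # Pairwise Euclidean distances via broadcasting, shape (n, n).
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Each point lies in its own sphere, so every count is at least 1.
    return (dists <= radius).sum(axis=1)


def placeholder_radius(data, percentile=20.0):
    """Hypothetical radius choice: a low percentile of pairwise distances.

    ASSUMPTION: this is NOT the information-optimal radius from the paper,
    only a stand-in so that the sketch runs end to end.
    """
    data = np.asarray(data, dtype=float)
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (strict upper triangle).
    upper = dists[np.triu_indices(len(data), k=1)]
    return float(np.percentile(upper, percentile))


# Toy data: three nearby points and one isolated point.
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
counts = sphere_density(points, radius=0.5)
print(counts.tolist())  # [3, 3, 3, 1]
```

In this toy example the isolated point receives the lowest count, which matches the abstract's remark that density information helps in detecting outliers. The all-pairs distance matrix needs memory quadratic in the number of points, so the sketch is only suitable for small data sets.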

Alfred Ultsch

Data Sets | Density Estimation | GFKL 2004 | Hyper Spheres

Post Info

Added	01 Jul 2010
Updated	01 Jul 2010
Type	Conference
Year	2004
Where	GFKL
Authors	Alfred Ultsch
