Microarray analysis using clustering algorithms can suffer from lack of inter-method consistency in assigning related gene-expression profiles to clusters. Obtaining a consensus s...
Paul Kellam, Stephen Swift, Allan Tucker, Veronica...
MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. Hadoop is an open-...
Matei Zaharia, Andy Konwinski, Anthony D. Joseph, ...
We consider the concepts of a t-total vertex cover and a t-total edge cover (t 1), which generalize the notions of a vertex cover and an edge cover, respectively. A t-total verte...
We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean spaces, in the setting where k is part of the input (not a constant). For the k-means pr...
Many real datasets have uncertain categorical attribute values that are only approximately measured or imputed. Uncertainty in categorical data is commonplace in many applications...