Efficient data storage, a major concern in the modern computer industry, is mostly provided today by the traditional magnetic disk. Unfortunately, the cost of a disk transfer m...
Data clustering is a difficult problem due to the complex and heterogeneous nature of multidimensional data. To improve clustering accuracy, we propose a scheme to capture the lo...
Addressing the long-term preservation issues associated with scientific data is a complex challenge compounded by: the scale and multidisciplinary nature of the problem; the wide ...
In this paper we present a coarse-grained parallel algorithm, CONQUEST, for constructing bounded-error summaries of high-dimensional binary attributed data in a distribute...
A methodology for automatically identifying and clustering semantic features or topics in a heterogeneous text collection is presented. Textual data is encoded using a low rank no...
Farial Shahnaz, Michael W. Berry, V. Paul Pauca, R...