Clustering performance data efficiently at massive scales


Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune their applications and to exploit these systems fully. However, extreme scales pose unique challenges for performance-tuning tools, which can generate significant volumes of I/O. Compute-to-I/O ratios have increased drastically as systems have grown, and the I/O systems of large machines can handle the peak load from only a small fraction of cores. Tool developers need efficient techniques to analyze and to reduce performance data from large numbers of cores. We introduce CAPEK, a novel parallel clustering algorithm that enables in-situ analysis of performance data at run time. Our algorithm scales sub-linearly to 131,072 processes, running in less than one second even at that scale, which is fast enough for on-line use in production runs. The CAPEK implementation is fully generic and can be used for many typ...
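The abstract does not spell out how CAPEK works, so the following is only an illustrative sketch of the general idea of reducing per-process performance data by clustering: a small, serial k-medoids routine in Python. The function names, the Manhattan distance, the evenly spaced initial medoids, and the sample profiles are all assumptions made for this example, not details taken from the paper.

```python
def manhattan(a, b):
    """Sum of absolute coordinate differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids, dist):
    """Label each point with the position (in `medoids`) of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, k, dist=manhattan, max_iter=100):
    """Alternate assignment and medoid-update steps until the medoids stop changing.

    Returns (medoids, labels): `medoids` holds indices into `points`, and
    `labels[i]` is the cluster number of `points[i]`.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Assumption: deterministic, evenly spaced starting medoids.
    medoids = [i * n // k for i in range(k)]

    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for c in range(k):
            members = [i for i, label in enumerate(labels) if label == c]
            if not members:
                # Empty cluster (possible with duplicate points): keep the old medoid.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance
            # to all other members of its cluster.
            new_medoids.append(
                min(members, key=lambda i: sum(dist(points[i], points[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return medoids, assign(points, medoids, dist)


# Hypothetical per-process profiles: (compute seconds, wait seconds).
profiles = [
    (1.0, 0.1), (1.1, 0.1), (0.9, 0.2), (1.0, 0.2),
    (5.0, 2.0), (5.2, 2.1), (4.9, 1.9), (5.1, 2.0),
]

medoids, labels = k_medoids(profiles, k=2)
print("representative profiles:", [profiles[m] for m in medoids])
print("cluster of each process:", labels)
```

In a sketch like this, only the k representative profiles and the per-process labels would need to be written out instead of every profile, which is the kind of data reduction the abstract motivates. A parallel, in-situ version such as the one the abstract describes would require considerably more machinery than is shown here.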

Todd Gamblin, Bronis R. de Supinski, Martin Schulz


Detailed Performance Measurements | Distributed And Parallel Computing | ICS 2010 | Performance Data | Run Time |


» DataGarage: Warehousing Massive Performance Data on Commodity Servers

» SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets

» A Scalable Parallel Subspace Clustering Algorithm for Massive Data Sets

» Efficient Massive Sharing of Content among Peers

» An efficient, scalable and flexible data transfer architecture for multiprocessor SoC with ...

» Efficient Metadata Generation to Enable Interactive Data Discovery over Large-Scale Scienti...

» Efficient Data-Movement for Lightweight I/O

» Massive Semantic Web data compression with MapReduce

Post Info

Added	29 Sep 2010
Updated	29 Sep 2010
Type	Conference
Year	2010
Where	ICS
Authors	Todd Gamblin, Bronis R. de Supinski, Martin Schulz, Robert J. Fowler, Daniel A. Reed


Sciweavers
