

Collective operations for wide-area message passing systems using adaptive spanning trees

Abstract— We propose a method for wide-area message passing systems to perform collective operations using dynamically created spanning trees. In our proposal, broadcasts and reductions are performed efficiently using topology-aware spanning trees constructed at run-time; processors autonomously measure latency and bandwidth to create latency-aware trees for short messages and bandwidth-aware trees for long messages. Our spanning trees adapt to topology changes due to the joining or leaving of processors; when processors join or leave a computation, processors repair the spanning trees so that effective execution of collective operations can continue. With 128 to 201 processors distributed over 3 to 4 clusters, the latency of our broadcast was within a factor of 2 of a static topology-aware implementation, and our broadcast achieved 82 percent of the bandwidth of a static topology-aware implementation. Moreover, when some processors joined or left a computation, our broadcast tempor...
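The abstract does not say how the trees are built, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each processor has a symmetric matrix of measured pairwise costs and uses Prim's minimum spanning tree as a stand-in construction. The names `build_tree`, `pick_tree`, and `repair_after_leave` and the 64 KiB message-size threshold are hypothetical, as is the simple repair rule of reattaching orphans to the departed node's parent.

```python
import heapq


def build_tree(cost, root=0):
    """Return {node: parent} for a minimum spanning tree (Prim's algorithm).

    `cost` is a symmetric matrix of measured link costs, for example
    latency in ms, or 1 / bandwidth for a bandwidth-aware tree.
    This is an illustrative stand-in, not the construction from the paper.
    """
    n = len(cost)
    parent = {root: None}
    heap = [(cost[root][v], root, v) for v in range(n) if v != root]
    heapq.heapify(heap)
    while heap and len(parent) < n:
        _, u, v = heapq.heappop(heap)
        if v in parent:
            continue  # v was already attached through a cheaper link
        parent[v] = u
        for w in range(n):
            if w not in parent:
                heapq.heappush(heap, (cost[v][w], v, w))
    return parent


def pick_tree(msg_bytes, latency_tree, bandwidth_tree, threshold=64 * 1024):
    """Use the latency-aware tree for short messages, else the bandwidth-aware one.

    The threshold is an assumed value chosen for illustration.
    """
    return latency_tree if msg_bytes < threshold else bandwidth_tree


def repair_after_leave(parent, gone):
    """Return a new tree without `gone`, reattaching its children to its parent.

    Assumed repair rule for illustration; a departing root is not handled.
    """
    if parent[gone] is None:
        raise ValueError("this sketch does not handle the root leaving")
    grandparent = parent[gone]
    return {
        node: (grandparent if p == gone else p)
        for node, p in parent.items()
        if node != gone
    }


if __name__ == "__main__":
    # Two clusters: {0, 1} and {2, 3}; 1 ms inside a cluster, about 50 ms across.
    latency = [
        [0, 1, 50, 52],
        [1, 0, 51, 53],
        [50, 51, 0, 1],
        [52, 53, 1, 0],
    ]
    tree = build_tree(latency)
    print(tree)  # {0: None, 1: 0, 2: 0, 3: 2}
    print(repair_after_leave(tree, gone=2))
```

In the example matrix, the resulting tree crosses the wide-area link only once (0 to 2) and reaches node 3 over the fast intra-cluster link, which is the kind of topology awareness the abstract describes.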
Hideo Saito, Kenjiro Taura, Takashi Chikayama
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where GRID