Current multiprocessors such as the Cray T3D support interprocessor communication using partitioned dimension-order routers (PDRs). In a PDR implementation, the routing logic and sw...
As high-performance clusters continue to grow in size, the mean time between failures shrinks. Thus, the issues of fault tolerance and reliability are becoming one of the challengi...
Web services technology provides an approach for developing distributed applications using simple, well-defined interfaces. Due to the flexibility of this architecture, ...
Jim Lau, Lau Cheuk Lung, Joni da Silva Fraga, Giul...
As computer clusters continue to grow in size and popularity, fault tolerance is becoming a crucial factor in ensuring high performance and reliability for applications. To provide...
Antonio S. Martins, Ronaldo Augusto Lara Gonç...
Building dependable distributed systems using ad hoc methods is a challenging task. Without proper support, an application programmer must face the daunting requirement of having ...
Jennifer Ren, Michel Cukier, Paul Rubel, William H...