Sciweavers

482 search results - page 30 / 97
» A large-scale study of failures in high-performance computin...
DSN
2002
IEEE
15 years 8 months ago
Time-Constrained Failure Diagnosis in Distributed Embedded Systems
Advanced automotive control applications such as steer-by-wire are typically implemented as distributed systems comprising many embedded processors, sensors, and actuators inter...
Nagarajan Kandasamy, John P. Hayes, Brian T. Murra...
ICPPW
2002
IEEE
15 years 8 months ago
A Study of Dynamic Routing and Wavelength Assignment with Imprecise Network State Information
In large networks, maintaining precise global network state information is almost impossible. Many factors, such as non-negligible propagation delay, infrequent state updates due ...
Jun Zhou, Xin Yuan
EUMAS
2006
15 years 4 months ago
DimaX: A Fault-Tolerant Multi-Agent Platform
Fault tolerance is an important property of large-scale multi-agent systems, as the failure rate grows with both the number of hosts and deployed agents, and the duration of com...
Nora Faci, Zahia Guessoum, Olivier Marin
GRID
2007
Springer
15 years 9 months ago
On the dynamic resource availability in grids
Currently deployed grids gather together thousands of computational and storage resources for the benefit of a large community of scientists. However, the large scale, the wid...
Alexandru Iosup, Mathieu Jan, Omer Ozan Sonmez, Di...
P2P
2009
IEEE
15 years 10 months ago
Analysis of Failure Correlation Impact on Peer-to-Peer Storage Systems
Peer-to-peer storage systems aim to provide reliable long-term storage at low cost. In such systems, peers fail continuously; hence the necessity of self-repairing me...
Olivier Dalle, Frédéric Giroire, Jul...