Abstract — Distributed sensor system applications (e.g., wireless sensor networks) have been studied extensively in recent years. Such applications involve resource-limited embed...
Chung-Ching Shen, Roni Kupershtok, Shuvra S. Bhatt...
This poster paper summarizes our research on fault tolerance arising as a by-product of the evolutionary computation process. Past research has shown evidence of robustness emergin...
We propose group communication as an efficient mechanism to support fault tolerance. Our approach is based on an efficient reliable broadcast protocol that requires on average onl...
—Computing systems will grow significantly larger in the near future to satisfy the needs of computational scientists in areas like climate modeling, biophysics and cosmology. S...
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming common place. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...