Sciweavers

1113 search results - page 6 / 223
» Performance under Failures of DAG-based Parallel Computing
ICDCS
1998
IEEE
13 years 11 months ago
A Feedback Based Scheme for Improving TCP Performance in Ad-Hoc Wireless Networks
Ad-hoc networks consist of a set of mobile hosts that communicate using wireless links, without the use of other communication support facilities (such as base stations). The topo...
Kartik Chandran, Sudarshan Raghunathan, S. Venkate...
HPCC
2009
Springer
13 years 5 months ago
Reliability Optimization of Reconfigurable Computing-Based Fault-Tolerant System
The domain-partition (DP) model is a general model for the reliability maximization problem under given redundancy. In this paper, an improved DP model is used to formulate a reconfigurati...
Mi Zhou, Lihong Shang, Yu Hu
CLOUDCOM
2010
Springer
13 years 5 months ago
Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud
Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access c...
Keith R. Jackson, Lavanya Ramakrishnan, Krishna Mu...
CCGRID
2010
IEEE
13 years 8 months ago
Selective Recovery from Failures in a Task Parallel Programming Model
We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
PDPTA
2010
13 years 5 months ago
Collecting Sensor Data for High-Performance Computing: A Case-study
Many research questions remain open with regard to improving reliability in exascale systems. Among others, statistics-based analysis has been used to find anomalies, to isolate ...
Line C. Pouchard, Jonathan D. Dobson, Stephen W. P...