A proper understanding of communication patterns of parallel applications is important to optimize application performance and design better communication subsystems. Communicatio...
Phoebus is an infrastructure for improving end-to-end throughput in high-bandwidth, long-distance networks by using a “session layer” protocol and “gateways” in the networ...
In this paper, we consider the problem of supporting fault tolerance for adaptive and time-critical applications in heterogeneous and unreliable grid computing environments. Our g...
The UNIC code is being developed as part of the DOE’s Nuclear Energy Advanced Modeling and Simulation (NEAMS) program. UNIC is an unstructured, deterministic neutron transport c...
Dinesh K. Kaushik, Micheal Smith, Allan Wollaber, ...
In this paper, we discuss some of the lessons that we have learned working with the Hadoop and Sector/Sphere systems. Both of these systems are cloud-based systems designed to sup...
In Fine-Grained Cycle Sharing (FGCS) systems, machine owners voluntarily share their unused CPU cycles with guest jobs, as long as the performance degradation is tolerable. For gu...
Tanzima Zerin Islam, Saurabh Bagchi, Rudolf Eigenm...
In the quest for cognitive computing, we have built a massively parallel cortical simulator, C2, that incorporates a number of innovations in computation, memory, and communicatio...
Rajagopal Ananthanarayanan, Steven K. Esser, Horst...
This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a tas...