We develop distributed scheduling schemes that are based on simple random access algorithms and that require no message passing. In spite of their simplicity, these schemes are sh...
This paper describes a novel compile-time list-based task scheduling algorithm for distributed-memory systems, called Fast Load Balancing (FLB). Compared to other typical list sch...
Global variable promotion, i.e. allocating unaliased globals to registers, can significantly reduce the number of memory operations. This results in reduced cache activity and less...
Abstract. Buffered coscheduling is a new methodology that can substantially increase resource utilization, improve response time, and simplify the development of the run-time suppo...
Using multiple memory banks in parallel is a natural approach to boosting the performance of flash-memory storage systems. However, realistic data-access localities unevenly load eac...