Abstract. Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between ...
The memory hierarchy of most multicore systems contains one or more levels of cache that is shared among multiple cores. The shared-cache architecture presents many opportunities f...
A heuristic algorithm that maps data-processing tasks onto heterogeneous resources (i.e., processors and links of various capacities) is presented. The algorithm tries to achieve ...
Previous studies of application usage show that the performance of collective communications is critical for high-performance computing and is often overlooked when compared to ...
Jelena Pjesivac-Grbovic, Thara Angskun, George Bos...
With the advent of chip-multiprocessors, we are faced with the challenge of parallelizing performance-critical software. Transactional memory (TM) has emerged as a promising progr...
Marek Olszewski, Jeremy Cutler, J. Gregory Steffan