Message Passing Interface (MPI) is a widely used standard for managing coarse-grained concurrency on distributed computers. Debugging parallel MPI applications, however, has alway...
Ruini Xue, Xuezheng Liu, Ming Wu, Zhenyu Guo, Weng...
Due to power wall, memory wall, and ILP wall, we are facing the end of ever increasing single-threaded performance. For this reason, multicore and manycore processors are arising ...
Dependence-aware transactional memory (DATM) is a recently proposed model for increasing concurrency of memory transactions without complicating their interface. DATM manages depe...
Hany E. Ramadan, Indrajit Roy, Maurice Herlihy, Em...
Researchers in transactional memory (TM) have proposed open nesting as a methodology for increasing the concurrency of transactional programs. The idea is to ignore "low-leve...
On multiprocessors with explicitly managed memory hierarchies (EMM), software has the responsibility of moving data in and out of fast local memories. This task can be complex and...
Scott Schneider, Jae-Seung Yeom, Benjamin Rose, Jo...
Recent advances in polyhedral compilation technology have made it feasible to automatically transform affine sequential loop nests for tiled parallel execution on multi-core proce...
This paper presents a system deployed on parallel clusters to manage a collection of parallel simulations that make up a computational study. It explores how such a system can ext...
In a previous paper we show how the FLAME methods and tools provide a solution to compute dense dense linear algebra operations on a multi-GPU platform with reasonable performance...
This paper introduces a new way to provide strong atomicity in an implementation of transactional memory. Strong atomicity lets us offer clear semantics to programs, even if they ...