Data-parallel accelerator devices such as Graphical Processing Units (GPUs) are providing dramatic performance improvements over even multicore CPUs for lattice-oriented applicatio...
Writing shared-memory parallel programs is error-prone. Among the concurrency errors that programmers often face are atomicity violations, which are especially challenging. They h...
Brandon Lucia, Joseph Devietti, Karin Strauss, Lui...
Detecting data races in parallel programs is important for both software development and production-run diagnosis. Recently, there have been several proposals for hardware-assiste...
— One of the critical goals in code optimization for MPSoC architectures is to minimize the number of off-chip memory accesses. This is because such accesses can be extremely cos...
— This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intranode communication is critical for developing an efficient communication s...