hyperobjects (reducers) provide a linguistic abstraction for dynamic multithreading that allows different branches of a parallel program to maintain coordinated local views of the...
I.-Ting Angelina Lee, Aamir Shafi, Charles E. Leis...
Chip multiprocessors enable continued performance scaling with increasingly many cores per chip. As the throughput of computation outpaces available memory bandwidth, however, the...
Doe Hyun Yoon, Min Kyu Jeong, Michael Sullivan, Ma...
The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level...
Samira Manabi Khan, Zhe Wang, Daniel A. Jimé...
Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the l...
Marek Olszewski, Qin Zhao, David Koh, Jason Ansel,...
The deployment of computer vision algorithms in mobile applications is growing at a rapid pace. A primary component of the computer vision software pipeline is feature extraction,...
Jason Clemons, Andrew Jones, Robert Perricone, Sil...
The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational area concerning with the manipulation of chemical structural info...
—Memory corruption, reading uninitialized memory, using freed memory, and other memory-related errors are among the most difficult programming bugs to identify and fix due to t...
Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on mult...
Kaushik Veeraraghavan, Dongyoon Lee, Benjamin West...
Concurrency bugs are caused by non-deterministic interleavings between shared memory accesses. Their effects propagate through data and control dependences until they cause softwa...
Wei Zhang, Junghee Lim, Ramya Olichandran, Joel Sc...
Abstract--In this paper, we propose a new architectural technique to reduce energy dissipation of frame memory through adaptive bitwith compression. Unlike related approaches, the ...