The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...
Transactional memory is a promising, optimistic synchronization mechanism for chip-multiprocessor systems. The simplicity of atomic sections, instead of using explicit locks, is al...
Field Programmable Gate Arrays (FPGAs) have long held the promise of allowing designers to create systems with performance levels close to custom circuits but with a software-like...
The development of high-performance libraries has become extraordinarily difficult due to multiple processor cores, vector instruction sets, and deep memory hierarchies. Often, t...
The emergence of global-scale online services has galvanized scale-out software, characterized by splitting vast datasets and massive computation across many independent servers. ...
Pejman Lotfi-Kamran, Boris Grot, Michael Ferdman, ...