Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power ...
Robert Springer, David K. Lowenthal, Barry Rountre...
This work presents a general methodology for estimating the performance of an HPC workload when running on a future hardware architecture. Further, it demonstrates the methodology...
Ilya Sharapov, Robert Kroeger, Guy Delamarter, Raz...
We present two new nonblocking and contention-free implementations of synchronous queues, concurrent transfer channels in which producers wait for consumers just as consumers wait...
William N. Scherer III, Doug Lea, Michael L. Scott
Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed t...
Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hu...
Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scal...
We study a family of implementations for linked lists using fine-grain synchronisation. This approach enables greater concurrency, but correctness is a greater challenge than for ...
Viktor Vafeiadis, Maurice Herlihy, Tony Hoare, Mar...
We investigate a transactional memory runtime system providing scaling and strong consistency for generic C++ and SQL applications on commodity clusters. We introduce a novel page...
As multi-core architectures with Thread-Level Speculation (TLS) are becoming better understood, it is important to focus on TLS compilation. TLS compilers are interesting in that,...
Wei Liu, James Tuck, Luis Ceze, Wonsun Ahn, Karin ...