Abstract—As programmers are asked to manage more complicated parallel machines, it is likely that they will become increasingly dependent on tools such as multi-threaded data rac...
We propose a parallel global routing algorithm that concurrently processes routing subproblems corresponding to rectangular subregions covering the chip area. The algorithm uses a...
Tai-Hsuan Wu, Azadeh Davoodi, Jeffrey T. Linderoth
A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning st...
Memory performance is increasingly determining microprocessor performance and technology trends are exacerbating this problem. Most architectures use set-associative caches with L...
Zhenlin Wang, Kathryn S. McKinley, Arnold L. Rosen...
—Remote atomic memory operations are critical for achieving high-performance synchronization in tightly-coupled systems. Previous approaches to implementing atomic memory operati...
Keith D. Underwood, Michael Levenhagen, K. Scott H...