Abstract. An efficient parallel algorithm is presented and tested for computing selected components of H−1 where H has the structure of a Hamiltonian matrix of two-dimensional la...
Lin Lin, Chao Yang, Jianfeng Lu, Lexing Ying, Wein...
An important means of validating the design of commercial-grade shared memory multiprocessors is to run a large number of pseudo-random test programs on them. However, when intent...
This paper presents a technique, adaptive replication, for automatically eliminating synchronization bottlenecks in multithreaded programs that perform atomic operations on object...
Shared-memory provides a uniform and attractive mechanism for communication. For efficiency, it is often implemented with a layer of interpretive hardware on top of a message-pas...
Application performance tuning is a complex process that requires assembling various types of information and correlating it with source code to pinpoint the causes of performance...
John M. Mellor-Crummey, Robert J. Fowler, David B....