Message passing using the Message Passing Interface (MPI) is at present the most widely adopted framework for programming parallel applications for distributed-memory and clustere...
The performance of most embedded systems is critically dependent on the memory hierarchy performance. In particular, higher cache hit rate can provide significant performance boos...
Abstract--Many shared-memory parallel systems use lockbased synchronization mechanisms to provide mutual exclusion or reader-writer access to memory locations. Software locks are i...
Most existing work uses dual decomposition and subgradient methods to solve Network Utility Maximization (NUM) problems in a distributed manner, which suffer from slow rate of con...
In recent years, the increasing number of processor cores and limited increases in main memory bandwidth have led to the problem of the bandwidth wall, where memory bandwidth is b...
Guangyu Sun, Christopher J. Hughes, Changkyu Kim, ...