This paper describes a novel approach to generate an optimized schedule to run threads on distributed shared memory (DSM) systems. The approach relies upon a binary instrumentatio...
Multimedia vector instruction sets are becoming ubiquitous in most of the embedded systems used for multimedia, networking and communications. However, current compiler technology...
In programming high performance applications, shared address-space platforms are preferable for fine-grained computation, while distributed address-space platforms are more suita...
A key challenge in achieving high performance on software DSM systems is overcoming their relatively large communication latencies. In this paper, we consider two techniques which...
The widening gap between processor and memory performance is the main bottleneck for modern computer systems to achieve high processor utilization. In this paper, we propose a new...
Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edw...