Exploiting compile time knowledge to improve memory bandwidth can produce noticeable improvements at run-time [13, 1]. Allocating the data structure [13] to separate memories when...
The efficient mapping of program parallelism to multi-core processors is highly dependent on the underlying architecture. This paper proposes a portable and automatic compiler-bas...
In this paper we describe a framework that supports runtime service discovery in both pull and push modes. Our framework supports service discovery based on structural and behavio...
Thread-level speculation is a technique that brings thread-level parallelism beyond the data-flow limit by executing a piece of code ahead of time speculatively before all its inp...
Xiao-Feng Li, Zhao-Hui Du, Chen Yang, Chu-Cheow Li...
Competitive parallel execution (CPE) is a simple yet attractive technique to improve the performance of sequential programs on multi-core and multi-processor systems. A sequential...