Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence...
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...
We study the communication complexity of rumor spreading in the random phone-call model. Suppose n players communicate in parallel rounds, where in each round every player calls a...
This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly paralle...
Wen-mei W. Hwu, Shane Ryoo, Sain-Zee Ueng, John H....
Traditional parallel programming styles have many problems which hinder the development of parallel applications. The message passing style can be too complex for many programmers...