Software-controlled data prefetching is a promising technique for improving the performance of the memory subsystem to match today's high-performance processors. While prefet...
Today, high-level languages can exploit key characteristics of a processor's instruction set only through inline assembly or compiler intrinsics. Inserting intrinsic...
Synchronous programs may contain cyclic signal interdependencies. This prohibits static scheduling, which limits the choice of available compilation techniques for such...
We present an efficient base algorithm for binding-time analysis based on constraint solving and the union-find algorithm. In practice it has been used to handle all of Standard M...
The performance benefits of GPU parallelism can be enormous, but unlocking this performance potential is challenging. The applicability and performance of GPU parallelizations are...
Thomas B. Jablin, Prakash Prabhu, James A. Jablin,...