Recent advances in polyhedral compilation technology have made it feasible to automatically transform affine sequential loop nests for tiled parallel execution on multi-core proce...
Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
Techniques to reduce power dissipation for embedded systems have recently come into sharp focus in the technology development. Among these techniques, dynamic voltage scaling (DVS)...
One way to exploit Thread Level Parallelism (TLP) is to use architectures that implement novel multithreaded execution models, like Scheduled DataFlow (SDF). This latter model pro...
In this paper we propose a new class of scheduling policies, dubbed Concurrent Gang, that combines the advantages of gang scheduling for communication and synchronization intensiv...