Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on si...
The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the applica...
The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile a...
Emmanuel Agullo, Henricus Bouwmeester, Jack Dongar...
This paper presents a novel approach for the problem of generating tiled code for nested for-loops using a tiling transformation. Tiling or supernode transformation has been widel...
Georgios I. Goumas, Maria Athanasaki, Nectarios Ko...
—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...