Suppose that a program makes a sequence of m accesses (references) to data blocks, the cache can hold k < m blocks, an access to a block in the cache incurs one time unit, and ...
In a parallel system with multiple CPUs, one of the key problems is to assign loop iterations to processors. This problem, known as the loop scheduling problem, has been studied i...
Mahmut T. Kandemir, Taylan Yemliha, Seung Woo Son,...
Supercomputers need a huge budget to be built and maintained. To maximize the usage of their resources, application developers spend time to optimize the code of the parallel appl...
Thread-level speculation (TLS) has proven to be a promising method of extracting parallelism from both integer and scientific workloads, targeting speculative threads that range ...
Christopher B. Colohan, Anastassia Ailamaki, J. Gr...
As the number of cores continues to grow in both digital signal and general purpose processors, tools which perform automatic scheduling from model-based designs are of increasing...
Maxime Pelcat, Pierrick Menuet, Slaheddine Aridhi,...