We propose a technique which leverages configurable data caches to address the problem of cache interference in multitasking embedded systems. Data caches are often necessary to p...
Several techniques have been developed for identifying similar code fragments in programs. These similar fragments, referred to as code clones, can be used to identify redundant c...
he significant increase in the available computational power that took place in recent decades has been accompanied by a growing interest in the application of the evolutionary ap...
We present a number of optimization techniques to compute prefix sums on linked lists and implement them on multithreaded GPUs using CUDA. Prefix computations on linked structures ...
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms PBLAS. The PBLAS are targeted at distributed vector-vector, matrix-vector and matrixmatrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...