Abstract. How can we exploit a microprocessor as efficiently as possible? The “classic” approach is static optimization at compile-time, optimizing a program for all possible u...
Kevin Streit, Clemens Hammacher, Andreas Zeller, S...
We show a method for parallelizing top down dynamic programs in a straightforward way by a careful choice of a lock-free shared hash table implementation and randomization of the ...
Alex Stivala, Peter J. Stuckey, Maria Garcia de la...
: The present paper discusses the implementations of sparse matrix-vector products, which are crucial for high performance solutions of large-scale linear equations, on a PC-Cluste...
Embedded Block Coding with Optimal Truncation (EBCOT) is the fundamental and computationally very demanding part of the compression process of JPEG2000 image compression standard. ...
For high-data-rate applications, turbo-like iterative decoders are implemented with parallel hardware architecture. However, to achieve high throughput, concurrent accesses to each...
Awais Sani, Philippe Coussy, Cyrille Chavet, Eric ...