Wireless sensor networks (WSN) are designed for data gathering and processing, with particular requirements: low hardware complexity, low energy consumption, special traff...
The development of high-performance dense linear algebra (DLA) critically depends on highly optimized BLAS, and especially on the matrix multiplication routine (GEMM). This is espe...
The problem of writing high-performance parallel applications becomes even more challenging when irregular, sparse or adaptive methods are employed. In this paper we introduce com...
Recently there have been quite a number of papers discussing the use of redundant 4-to-2 adders for the accumulation of partial products in multipliers, claiming one type to be sup...
Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization i...
Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon...