This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
Rapid increases in computing and communication performance are exacerbating the long-standing problem of performance-limited input/output. Indeed, for many otherwise scalable para...
Phyllis Crandall, Ruth A. Aydt, Andrew A. Chien, D...
Reliable channel state information at the transmitter (CSIT) can improve the throughput of wireless networks significantly. In a realistic scenario, there is a mismatch between th...
Christof Jonietz, Wolfgang H. Gerstacker, Robert S...
Tuning compiler optimizations for rapidly evolving hardware makes porting and extending an optimizing compiler for each new platform extremely challenging. Iterative optimization i...
Grigori Fursin, Yuriy Kashnikov, Abdul Wahid Memon...
For embedded system development, several companies provide cross-platform development tools to aid in debugging, prototyping and optimization of programs. These are full system em...
Cristiano Pereira, Jeremy Lau, Brad Calder, Rajesh...