Abstract. Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common su...
Hao Zhu, David Goodell, William Gropp, Rajeev Thak...
With the development of computer systems, function inlining schemes were used to reduce execution time at the cost of increased code size. In embedded systems such as wireless sensor nodes, t...
Improving memory performance at the software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...
Software packet processing is becoming more important to enable differentiated and rapidly-evolving network services. With increasing numbers of programmable processor and acceler...
Embedded consumer devices are increasing their capabilities and can now implement new multimedia applications reserved only for powerful desktops a few years ago. These applicatio...
David Atienza, Christos Baloukas, Lazaros Papadopo...