One of the most important collective communication patterns used in scientific applications is the complete exchange, also called All-to-All. Although efficient algorithms have b...
In this work we investigate how the compiler technique of message strip mining performs in practice on contemporary high performance networks. Message strip mining attempts to redu...
We apply a scalable approach for practical, comprehensive design space evaluation and optimization. This approach combines design space sampling and statistical inference to ident...
The model of bulk-synchronous parallel computation (BSP) helps to implement portable general purpose algorithms while keeping predictable performance on different parallel compute...
“Is transactional memory useful?” is the question that cannot be answered until we provide substantial applications that can evaluate its capabilities. While existing TM appli...
Vladimir Gajinov, Ferad Zyulkyarov, Osman S. Unsal...