We introduce Scioto, Shared Collections of Task Objects, a lightweight framework for providing task management on distributed memory machines under one-sided and globalview parall...
James Dinan, Sriram Krishnamoorthy, D. Brian Larki...
To continue to improve processor performance, microarchitects seek to increase the effective instruction level parallelism (ILP) that can be exploited in applications. A fundament...
Optimizing the performance of dynamic load balancing toolkits and applications requires the adjustment of several runtime parameters; however, determining sufficiently good value...
Many complex systems require the use of floating point arithmetic that is exceedingly time consuming to perform on personal computers. However, floating point operators are also h...
Jason Lee, Lesley Shannon, Matthew J. Yedlin, Gary...
Using Linux for high-performance applications on the compute nodes of IBM Blue Gene/P is challenging because of TLB misses and difficulties with programming the network DMA engine...
Kazutomo Yoshii, Kamil Iskra, Harish Naik, Pete Be...