Effective overlap of computation and communication is a well understood technique for latency hiding and can yield significant performance gains for applications on high-end compu...
Aniruddha G. Shet, P. Sadayappan, David E. Bernhol...
The rapid growth of silicon densities has made it feasible to deploy reconfigurable hardware as a highly parallel computing platform. However, in most cases, the application needs...
Girish Venkataramani, Walid A. Najjar, Fadi J. Kur...
Abstract. Conventional performance environments are based on pro ling and event instrumentation. It becomes problematic as parallel systems scale to hundreds of nodes and beyond. A...
Xian-He Sun, Mario Pantano, Thomas Fahringer, Zhao...