As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPUs) increases, so does the importance of on-chip interconnection network des...
In heterogeneous and dynamic distributed systems like the Grid, detailed monitoring of workload and its resulting system performance (e.g. response time) is required to facilitate...
Rui Zhang, Steve Moyle, Steve McKeever, Stephen He...
A traditional fixed-function graphics accelerator has evolved into a programmable general-purpose graphics processing unit over the last few years. These powerful computing cores...
We present a performance model-driven framework for automated performance tuning (autotuning) of sparse matrix-vector multiply (SpMV) on systems accelerated by graphics processing...
The computational power provided by many-core graphics processing units (GPUs) has been exploited in many applications. The programming techniques currently employed on these GPUs...
Long Chen, Oreste Villa, Sriram Krishnamoorthy, Gu...