Sciweavers

139 search results - page 22 / 28
» ic 2009
Sort
View
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
ICS
2009
Tsinghua U.
14 years 2 months ago
Zero-content augmented caches
It has been observed that some applications manipulate large amounts of null data. Moreover these zero data often exhibit high spatial locality. On some applications more than 20%...
Julien Dusser, Thomas Piquet, André Seznec
ICS
2009
Tsinghua U.
14 years 2 months ago
Combining thread level speculation helper threads and runahead execution
With the current trend toward multicore architectures, improved execution performance can no longer be obtained via traditional single-thread instruction level parallelism (ILP), ...
Polychronis Xekalakis, Nikolas Ioannou, Marcelo Ci...
ICS
2009
Tsinghua U.
14 years 2 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
ICS
2009
Tsinghua U.
14 years 2 months ago
/scratch as a cache: rethinking HPC center scratch storage
To sustain emerging data-intensive scientific applications, High Performance Computing (HPC) centers invest a notable fraction of their operating budget on a specialized fast sto...
Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhk...