Sciweavers

1855 search results - page 352 / 371
» A coding theorem for distributed computation
HPCA
2007
IEEE
14 years 7 months ago
Improving Branch Prediction and Predicated Execution in Out-of-Order Processors
If-conversion is a compiler technique that reduces the misprediction penalties caused by hard-to-predict branches, transforming control dependencies into data dependencies. Althou...
Eduardo Quiñones, Joan-Manuel Parcerisa, An...
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
ICS
2009
Tsinghua U.
14 years 2 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state-of-the-art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
SAC
2006
ACM
14 years 1 month ago
Operating system multilevel load balancing
This paper describes an algorithm that allows Linux to perform multilevel load balancing in NUMA computers. The Linux scheduler implements a load balancing algorithm that uses str...
Mônica Corrêa, Avelino Francisco Zorzo...
SIGCSE
2005
ACM
14 years 1 month ago
Using image processing projects to teach CS1 topics
As Computer Science educators, we know that students learn more from projects that are fun and challenging, that seem “real” to them, and that allow them to be creative in des...
Richard Wicentowski, Tia Newhall