Sciweavers

108 search results - page 20 / 22
» Achieving efficient register allocation via parallelism
Sort
View
TOG
2008
106views more  TOG 2008»
13 years 6 months ago
BSGP: bulk-synchronous GPU programming
We present BSGP, a new programming language for general purpose computation on the GPU. A BSGP program looks much the same as a sequential C program. Programmers only need to supp...
Qiming Hou, Kun Zhou, Baining Guo
MICRO
2003
IEEE
143views Hardware» more  MICRO 2003»
14 years 7 days ago
VSV: L2-Miss-Driven Variable Supply-Voltage Scaling for Low Power
Energy-efficient processor design is becoming more and more important with technology scaling and with high performance requirements. Supply-voltage scaling is an efficient way to...
Hai Li, Chen-Yong Cher, T. N. Vijaykumar, Kaushik ...
CCGRID
2010
IEEE
13 years 7 months ago
FaReS: Fair Resource Scheduling for VMM-Bypass InfiniBand Devices
In order to address the high performance I/O needs of HPC and enterprise applications, modern interconnection fabrics, such as InfiniBand and more recently, 10GigE, rely on network...
Adit Ranadive, Ada Gavrilovska, Karsten Schwan
ICPP
2008
IEEE
14 years 1 months ago
Optimizing Issue Queue Reliability to Soft Errors on Simultaneous Multithreaded Architectures
The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) p...
Xin Fu, Wangyuan Zhang, Tao Li, José A. B. ...
HPCA
2007
IEEE
14 years 7 months ago
Improving Branch Prediction and Predicated Execution in Out-of-Order Processors
If-conversion is a compiler technique that reduces the misprediction penalties caused by hard-to-predict branches, transforming control dependencies into data dependencies. Althou...
Eduardo Quiñones, Joan-Manuel Parcerisa, An...