Sciweavers

244 search results - page 43 / 49
» Optimizing Loop Performance for Clustered VLIW Architectures
SBACPAD
2003
IEEE
14 years 1 month ago
A Parallel Implementation of the LTSn Method for a Radiative Transfer Problem
A radiative transfer solver that implements the LTSn method was optimized and parallelized using the MPI message passing communication library. Timing and profiling informatio...
Roberto P. Souto, Haroldo F. de Campos Velho, Step...
ICDE
2005
IEEE
14 years 9 months ago
Uncovering Database Access Optimizations in the Middle Tier with TORPEDO
A popular architecture for enterprise applications is one of a stateless object-based server accessing persistent data through Object-Relational mapping software. The reported ben...
Bruce E. Martin
DAC
2003
ACM
14 years 8 months ago
Distributed sleep transistor network for power reduction
Sleep transistors are effective to reduce dynamic and leakage power. The cluster-based design was proposed to reduce the sleep transistor area by clustering gates to minimize the ...
Changbo Long, Lei He
CGO
2010
IEEE
14 years 2 months ago
Automatic creation of tile size selection models
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. Effective use of tiling requires selection and tuning of the tile sizes. This is...
Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sa...
HPCA
2007
IEEE
14 years 8 months ago
Illustrative Design Space Studies with Microarchitectural Regression Models
We apply a scalable approach for practical, comprehensive design space evaluation and optimization. This approach combines design space sampling and statistical inference to ident...
Benjamin C. Lee, David M. Brooks