Sciweavers

436 search results - page 55 / 88
PPOPP
2009
ACM
14 years 9 months ago
OpenMP to GPGPU: a compiler framework for automatic translation and optimization
GPGPUs have recently emerged as powerful vehicles for general-purpose high-performance computing. Although a new Compute Unified Device Architecture (CUDA) programming model from N...
Seyong Lee, Seung-Jai Min, Rudolf Eigenmann
ISQED
2007
IEEE
14 years 3 months ago
Wavelet-Based Passivity Preserving Model Order Reduction for Wideband Interconnect Characterization
Model order reduction plays a key role in determining VLSI system performance and the optimization of interconnects. In this paper, we develop an accurate and provably passive met...
Mehboob Alam, Arthur Nieuwoudt, Yehia Massoud
MEMICS
2010
13 years 3 months ago
GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000
Embedded Block Coding with Optimal Truncation (EBCOT) is the fundamental and computationally very demanding part of the compression process of the JPEG2000 image compression standard. ...
Jiri Matela, Vit Rusnak, Petr Holub
PLDI
2003
ACM
14 years 2 months ago
A comparison of empirical and model-driven optimization
Empirical program optimizers estimate the values of key optimization parameters by generating different program versions and running them on the actual hardware to determine which...
Kamen Yotov, Xiaoming Li, Gang Ren, Michael Cibuls...
DAC
2010
ACM
13 years 9 months ago
Instruction cache locking using temporal reuse profile
The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have significant positive impact on the per...
Yun Liang, Tulika Mitra