Sciweavers

84 search results - page 6 / 17
» Loop Distribution and Fusion with Timing and Code Size Optim...
Sort
View
ICPPW
2008
IEEE
14 years 1 months ago
Performance Analysis and Optimization of Parallel Scientific Applications on CMP Cluster Systems
Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system....
Xingfu Wu, Valerie E. Taylor, Charles W. Lively, S...
PDP
2006
IEEE
14 years 1 months ago
A Single-Loop Approach to SIMD Parallelization of 2-D Wavelet Lifting
Widespread use of wavelet transforms as in JPEG2000 demands efficient implementations on general purpose computers as well as dedicated hardware. The increasing availability of S...
Rade Kutil
IPPS
1999
IEEE
13 years 11 months ago
Cascaded Execution: Speeding Up Unparallelized Execution on Shared-Memory Multiprocessors
Both inherently sequential code and limitations of analysis techniques prevent full parallelization of many applications by parallelizing compilers. Amdahl's Law tells us tha...
Ruth E. Anderson, Thu D. Nguyen, John Zahorjan
TCAD
1998
107views more  TCAD 1998»
13 years 7 months ago
Optimizing dominant time constant in RC circuits
— Conventional methods for optimal sizing of wires and transistors use linear resistor-capacitor (RC) circuit models and the Elmore delay as a measure of signal delay. If the RC ...
Lieven Vandenberghe, Stephen P. Boyd, Abbas A. El ...
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron