Among all software cache coherence strategies, the ones that are based on the concept of timestamps show the greatest potential in terms of cache performance. The early timestamp...
Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way of reducing s...
Edwin Hsing-Mean Sha, Chenhua Lang, Nelson L. Pass...
Three-dimensional computed tomography is a computationally intensive procedure, requiring large amounts of RAM and processing power. Parallel methods for two dimension...
David A. Reimann, Vipin Chaudhary, Michael J. Flyn...
We study the scalability of 2-D discrete wavelet transform algorithms on fine-grained parallel architectures. The principal operation in the 2-D DWT is the filtering operation us...
Jamshed N. Patel, Ashfaq A. Khokhar, Leah H. Jamie...
Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
A primary problem in the performance measurement of high-level parallel programming languages is to map low-level events to high-level programming constructs. We discuss several as...