This site uses cookies to deliver our services and to ensure you get the best experience. By continuing to use this site, you consent to our use of cookies and acknowledge that you have read and understand our Privacy Policy, Cookie Policy, and Terms
Data locality and synchronization overhead are two important factors that affect the performance of applications on multiprocessors. Loop fusion is an effective way for reducing s...
Edwin Hsing-Mean Sha, Chenhua Lang, Nelson L. Pass...
Abstract - Three dimensional computed tomography is a computationally intensive procedure, requiring large amounts of R A M and processing power. Parallel methods for two dimension...
David A. Reimann, Vipin Chaudhary, Michael J. Flyn...
: We study the scalability of 2-D discrete wavelet transform algorithms on fine-grained parallel architectures. The principal operation in the 2-D DWT is the filtering operation us...
Jamshed N. Patel, Ashfaq A. Khokhar, Leah H. Jamie...
Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
A primary problem in the performance measurement of high-level parallel programming languages is to map lowlevel events to high-level programming constructs. We discuss several as...
In s-to-p broadcasting, s processors in a p-processor machine contain a message to be broadcast to all the processors, 1 s p. We present a number of different broadcasting algori...
In this workshop session, three speakers present their viewpoints and contributions to the topic of portable parallel programming languages. They are Dennis Gannon from Indiana Un...