Sciweavers

» A Performance Prediction Methodology for Data-dependent Para...
ICS
1998
Tsinghua U.
14 years 1 month ago
Load Execution Latency Reduction
In order to achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store instructions. This paper...
Bryan Black, Brian Mueller, Stephanie Postal, Ryan...
IPPS
1996
IEEE
14 years 29 days ago
Commutativity Analysis: A Technique for Automatically Parallelizing Pointer-Based Computations
This paper introduces an analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutati...
Martin C. Rinard, Pedro C. Diniz
IPCCC
2007
IEEE
14 years 3 months ago
Application Insight Through Performance Modeling
Tuning the performance of applications requires understanding the interactions between code and target architecture. This paper describes a performance modeling approach that not ...
Gabriel Marin, John M. Mellor-Crummey
APPT
2009
Springer
14 years 3 months ago
Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox
In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance...
Christoph Bosshard, Roland Bouffanais, Christian C...
CODES
2009
IEEE
13 years 10 months ago
A Scalable Parallel H.264 Decoder on the Cell Broadband Engine Architecture
The H.264 video codec provides exceptional video compression while imposing dramatic increases in computational complexity over previous standards. While exploiting parallelism in...
Michael A. Baker, Pravin Dalale, Karam S. Chatha, ...