Sciweavers

292 search results - page 4 / 59
» Benchmarks and performance analysis of decimal floating-poin...
PPOPP
2003
ACM
14 years 1 month ago
Using thread-level speculation to simplify manual parallelization
In this paper, we provide examples of how thread-level speculation (TLS) simplifies manual parallelization and enhances its performance. A number of techniques for manual parallel...
Manohar K. Prabhu, Kunle Olukotun
TC
2008
13 years 8 months ago
High-Performance Mixed-Precision Linear Solver for FPGAs
Compared to higher-precision data formats, lower-precision data formats result in higher performance for computationally intensive applications on FPGAs because of their lower res...
Junqing Sun, Gregory D. Peterson, Olaf O. Storaasl...
ANCS
2009
ACM
13 years 6 months ago
Design and performance analysis of a DRAM-based statistics counter array architecture
The problem of efficiently maintaining a large number (say millions) of statistics counters that need to be updated at very high speeds (e.g. 40 Gb/s) has received considerable re...
Haiquan (Chuck) Zhao, Hao Wang, Bill Lin, Jun (Jim...
IPPS
1998
IEEE
14 years 20 days ago
Optimizing Data Scheduling on Processor-in-Memory Arrays
In the study of the PetaFlop project, a Processor-In-Memory array was proposed as a target architecture for achieving a computing performance of 10^15 floating point operations per second. ...
Yi Tian, Edwin Hsing-Mean Sha, Chantana Chantrapor...
IPPS
2010
IEEE
13 years 6 months ago
Performance modeling of heterogeneous systems
Predicting how well applications may run on modern systems is becoming increasingly challenging. It is no longer sufficient to look at the number of floating point operations and commu...
Jan Christian Meyer, Anne C. Elster