Sciweavers

» Improving memory hierarchy performance for irregular applica...
FPGA
2012
ACM
14 years 1 month ago
Optimizing SDRAM bandwidth for custom FPGA loop accelerators
Memory bandwidth is critical to achieving high performance in many FPGA applications. The bandwidth of SDRAM memories is, however, highly dependent upon the order in which address...
Samuel Bayliss, George A. Constantinides
IPPS
2006
IEEE
16 years 3 days ago
On the performance of parallel normalized explicit preconditioned conjugate gradient type methods
A new class of parallel normalized preconditioned conjugate gradient type methods in conjunction with normalized approximate inverses algorithms, based on normalized approximate f...
George A. Gravvanis, Konstantinos M. Giannoutakis
PPOPP
1997
ACM
15 years 9 months ago
Shared Memory Performance Profiling
This paper describes a new approach to finding performance bottlenecks in shared-memory parallel programs and its embodiment in the Paradyn Parallel Performance Tools running with...
Zhichen Xu, James R. Larus, Barton P. Miller
ACMMSP
2005
ACM
15 years 11 months ago
A locality-improving dynamic memory allocator
In general-purpose applications, most data is dynamically allocated. The memory manager therefore plays a crucial role in application performance by determining the spatial locali...
Yi Feng, Emery D. Berger
ASPLOS
1998
ACM
15 years 10 months ago
Performance of Database Workloads on Shared-Memory Systems with Out-of-Order Processors
Database applications such as online transaction processing (OLTP) and decision support systems (DSS) constitute the largest and fastest-growing segment of the market for multipro...
Parthasarathy Ranganathan, Kourosh Gharachorloo, S...