Sciweavers

380 search results - page 64 / 76
» Cache-oblivious simulation of parallel programs
Sort
View
ICS
2010
Tsinghua U.
13 years 9 months ago
Timing local streams: improving timeliness in data prefetching
Data prefetching technique is widely used to bridge the growing performance gap between processor and memory. Numerous prefetching techniques have been proposed to exploit data pa...
Huaiyu Zhu, Yong Chen, Xian-He Sun
IPPS
2003
IEEE
14 years 20 days ago
Extending OpenMP to Support Slipstream Execution Mode
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
Khaled Z. Ibrahim, Gregory T. Byrd
ASPLOS
2000
ACM
13 years 11 months ago
An Analysis of Operating System Behavior on a Simultaneous Multithreaded Architecture
This paper presents the first analysis of operating system execution on a simultaneous multithreaded (SMT) processor. While SMT has been studied extensively over the past 6 years,...
Joshua Redstone, Susan J. Eggers, Henry M. Levy
FCCM
2006
IEEE
133views VLSI» more  FCCM 2006»
14 years 1 months ago
A Scalable FPGA-based Multiprocessor
It has been shown that a small number of FPGAs can significantly accelerate certain computing tasks by up to two or three orders of magnitude. However, particularly intensive lar...
Arun Patel, Christopher A. Madill, Manuel Salda&nt...
ISCA
2011
IEEE
238views Hardware» more  ISCA 2011»
12 years 11 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas