Sciweavers

IPPS
2009
IEEE
14 years 2 months ago
Exploiting DMA to enable non-blocking execution in Decoupled Threaded Architecture
DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on exi...
Roberto Giorgi, Zdravko Popovic, Nikola Puzovic
EUROPAR
2003
Springer
14 years 18 days ago
Compression in Data Caches with Compressible Field Isolation for Recursive Data Structures
We introduce a software/hardware scheme called the Field Array Compression Technique (FACT) which reduces cache misses due to recursive data structures. Using a data layout transfo...
Masamichi Takagi, Kei Hiraki
CORR
2009
Springer
13 years 5 months ago
Parallelizing Deadlock Resolution in Symbolic Synthesis of Distributed Programs
Previous work has shown that there are two major complexity barriers in the synthesis of fault-tolerant distributed programs, namely generation of fault-span, the set of states re...
Fuad Abujarad, Borzoo Bonakdarpour, Sandeep S. Kul...
IEEEPACT
2002
IEEE
14 years 9 days ago
Using the Compiler to Improve Cache Replacement Decisions
Memory performance is increasingly determining microprocessor performance and technology trends are exacerbating this problem. Most architectures use set-associative caches with L...
Zhenlin Wang, Kathryn S. McKinley, Arnold L. Rosen...
HPCA
2008
IEEE
14 years 7 months ago
Runahead Threads to improve SMT performance
In this paper, we propose Runahead Threads (RaT) as a valuable solution for both reducing resource contention and exploiting memory-level parallelism in Simultaneous Multithreaded...
Tanausú Ramírez, Alex Pajuelo, Olive...