Sciweavers

347 search results - page 67 / 70
» Caching processor general registers
Sort
View
CASES
2006
ACM
14 years 1 months ago
Reaching fast code faster: using modeling for efficient software thread integration on a VLIW DSP
When integrating software threads together to boost performance on a processor with instruction-level parallel processing support, it is rarely clear which code regions should be ...
Won So, Alexander G. Dean
FPGA
2010
ACM
232views FPGA» more  FPGA 2010»
13 years 10 months ago
High-throughput bayesian computing machine with reconfigurable hardware
We use reconfigurable hardware to construct a high throughput Bayesian computing machine (BCM) capable of evaluating probabilistic networks with arbitrary DAG (directed acyclic gr...
Mingjie Lin, Ilia Lebedev, John Wawrzynek
ASPLOS
2008
ACM
13 years 11 months ago
Communication optimizations for global multi-threaded instruction scheduling
The recent shift in the industry towards chip multiprocessor (CMP) designs has brought the need for multi-threaded applications to mainstream computing. As observed in several lim...
Guilherme Ottoni, David I. August
IEEEPACT
2008
IEEE
14 years 4 months ago
Scalable and reliable communication for hardware transactional memory
In a hardware transactional memory system with lazy versioning and lazy conflict detection, the process of transaction commit can emerge as a bottleneck. This is especially true ...
Seth H. Pugsley, Manu Awasthi, Niti Madan, Naveen ...
ICS
2007
Tsinghua U.
14 years 4 months ago
Optimization of data prefetch helper threads with path-expression based statistical modeling
This paper investigates helper threads that improve performance by prefetching data on behalf of an application’s main thread. The focus is data prefetch helper threads that lac...
Tor M. Aamodt, Paul Chow