Sciweavers

309 search results - page 4 / 62
» Parallel Memory Architecture for Arbitrary Stride Accesses
Sort
View
MICRO
2010
IEEE
143views Hardware» more  MICRO 2010»
13 years 2 months ago
SD3: A Scalable Approach to Dynamic Data-Dependence Profiling
Abstract--As multicore processors are deployed in mainstream computing, the need for software tools to help parallelize programs is increasing dramatically. Data-dependence profili...
Minjang Kim, Hyesoon Kim, Chi-Keung Luk
HICSS
1995
IEEE
109views Biometrics» more  HICSS 1995»
13 years 11 months ago
The architecture of an optimistic CPU: the WarpEngine
The architecture for a shared memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable in...
John G. Cleary, Murray Pearson, Husam Kinawi
ICPP
2009
IEEE
13 years 5 months ago
A Resource Optimized Remote-Memory-Access Architecture for Low-latency Communication
This paper introduces a new highly optimized architecture for remote memory access (RMA). RMA, using put and get operations, is a one-sided communication function which amongst ot...
Mondrian Nüssle, Martin Scherer, Ulrich Br&uu...
NPC
2005
Springer
14 years 29 days ago
Performance Modelling and Optimization of Memory Access on Cellular Computer Architecture Cyclops64
This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for the preloading and loop unrolling approaches to op...
Yanwei Niu, Ziang Hu, Kenneth E. Barner, Guang R. ...
MICRO
2009
IEEE
147views Hardware» more  MICRO 2009»
14 years 2 months ago
Complexity effective memory access scheduling for many-core accelerator architectures
Modern DRAM systems rely on memory controllers that employ out-of-order scheduling to maximize row access locality and bank-level parallelism, which in turn maximizes DRAM bandwid...
George L. Yuan, Ali Bakhoda, Tor M. Aamodt