Sciweavers

955 search results - page 21 / 191
» Performance optimization of multiple memory architectures fo...
Sort
View
SPAA
2010
ACM
14 years 1 months ago
Towards optimizing energy costs of algorithms for shared memory architectures
Energy consumption by computer systems has emerged as an important concern. However, the energy consumed in executing an algorithm cannot be inferred from its performance alone: i...
Vijay Anand Korthikanti, Gul Agha
IWMM
2011
Springer
270views Hardware» more  IWMM 2011»
12 years 12 months ago
Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead
Multiprocessors based on processors with multiple cores usually include a non-uniform memory architecture (NUMA); even current 2-processor systems with 8 cores exhibit non-uniform...
Zoltan Majo, Thomas R. Gross
ISCA
1995
IEEE
118views Hardware» more  ISCA 1995»
14 years 16 days ago
The EM-X Parallel Computer: Architecture and Basic Performance
Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor ...
Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Ha...
SASP
2009
IEEE
222views Hardware» more  SASP 2009»
14 years 3 months ago
A memory optimization technique for software-managed scratchpad memory in GPUs
—With the appearance of massively parallel and inexpensive platforms such as the G80 generation of NVIDIA GPUs, more real-life applications will be designed or ported to these pl...
Maryam Moazeni, Alex A. T. Bui, Majid Sarrafzadeh
EUROPAR
2001
Springer
14 years 1 months ago
Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibits a static and structured memory access pattern which results in a large amount of communic...
Sverker Holmgren, Dan Wallin