Sciweavers

260 search results - page 3 / 52
» Performance Modelling and Optimization of Memory Access on C...
Sort
View
CLUSTER
2003
IEEE
14 years 24 days ago
Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
ISCA
1995
IEEE
118views Hardware» more  ISCA 1995»
13 years 11 months ago
The EM-X Parallel Computer: Architecture and Basic Performance
Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor ...
Yuetsu Kodama, Hirohumi Sakane, Mitsuhisa Sato, Ha...
3DIC
2009
IEEE
169views Hardware» more  3DIC 2009»
14 years 18 days ago
3-D memory organization and performance analysis for multi-processor network-on-chip architecture
Several forms of processor memory organizations have been in use to optimally access off-chip memory systems mainly the Hard disk drives (HDD). Recent trends show that the solid s...
Awet Yemane Weldezion, Zhonghai Lu, Roshan Weerase...
EUROPAR
2001
Springer
13 years 12 months ago
Performance of High-Accuracy PDE Solvers on a Self-Optimizing NUMA Architecture
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibits a static and structured memory access pattern which results in a large amount of communic...
Sverker Holmgren, Dan Wallin
SPAA
2010
ACM
14 years 9 days ago
Towards optimizing energy costs of algorithms for shared memory architectures
Energy consumption by computer systems has emerged as an important concern. However, the energy consumed in executing an algorithm cannot be inferred from its performance alone: i...
Vijay Anand Korthikanti, Gul Agha