Sciweavers

» Optimizing Memory System Performance for Communication in Pa...
CODES
2006
IEEE
14 years 2 months ago
Integrated analysis of communicating tasks in MPSoCs
Predicting timing behavior is key to efficient embedded real-time system design and verification. Especially memory accesses and co-processor calls over shared communication net...
Simon Schliecker, Matthias Ivers, Rolf Ernst
IPPS
2000
IEEE
14 years 13 days ago
PDRS: A Performance Data Representation System
We present the design and development of a Performance Data Representation System (PDRS) for scalable parallel computing. PDRS provides decision support that helps users find the r...
Xian-He Sun, Xingfu Wu
EUROPAR
2010
Springer
13 years 9 months ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
WOMPAT
2001
Springer
14 years 14 days ago
CableS: Thread Control and Memory System Extensions for Shared Virtual Memory Clusters
Clusters of high-end workstations and PCs are currently used in many application domains to perform large-scale computations or as scalable servers for I/O bound tasks. Although cl...
Peter Jamieson, Angelos Bilas
CSE
2011
IEEE
12 years 7 months ago
Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems
In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, M...
Xingfu Wu, Valerie E. Taylor