Sciweavers

» Optimizing Memory System Performance for Communication in Pa...
SC
2005
ACM
14 years 1 month ago
A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L
Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth-fi...
Andy Yoo, Edmond Chow, Keith W. Henderson, Will Mc...
SASP
2009
IEEE
14 years 2 months ago
A memory optimization technique for software-managed scratchpad memory in GPUs
With the appearance of massively parallel and inexpensive platforms such as the G80 generation of NVIDIA GPUs, more real-life applications will be designed or ported to these pl...
Maryam Moazeni, Alex A. T. Bui, Majid Sarrafzadeh
IPPS
2000
IEEE
14 years 11 days ago
Combining Fusion Optimizations and Piecewise Execution of Nested Data-Parallel Programs
Nested data-parallel programs often have large memory requirements due to their high degree of parallelism. Piecewise execution is an implementation technique used to min...
W. Pfannenstiel
GROUP
1997
ACM
14 years 4 days ago
Bridging the gap between face-to-face communication and long-term collaboration
During the different phases of a project, stakeholders have different communication needs and make use of different communication media to satisfy them. A group memory system must...
Stefanie N. Lindstaedt, Kurt Schneider
IPPS
2008
IEEE
14 years 2 months ago
Modeling and predicting application performance on parallel computers using HPC challenge benchmarks
A method is presented for modeling application performance on parallel computers in terms of the performance of microkernels from the HPC Challenge benchmarks. Specifically, the a...
Wayne Pfeiffer, Nicholas J. Wright