Sciweavers

120 search results - page 11 / 24
» Optimizing irregular shared-memory applications for distribu...
Sort
View
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
CCGRID
2008
IEEE
14 years 2 months ago
Optimized Distributed Data Sharing Substrate in Multi-core Commodity Clusters: A Comprehensive Study with Applications
Distributed applications tend to have a complex design due to issues such as concurrency, synchronization and communication. Researchers in the past have proposed abstractions to ...
Karthikeyan Vaidyanathan, Ping Lai, Sundeep Narrav...
SNPD
2003
13 years 9 months ago
Incomplete Information Processing for Optimization of Distributed Applications
This paper focuses on non-strict processing, optimization, and partial evaluation of MPI programs which use incremental data structures (ISs). We describe the design and implement...
Alfredo Cristóbal-Salas, Andrei Tchernykh, ...
IPPS
2007
IEEE
14 years 1 months ago
A Comprehensive Analysis of OpenMP Applications on Dual-Core Intel Xeon SMPs
Hybrid chip multithreaded SMPs present new challenges as well as new opportunities to maximize performance. Our intention is to discover the optimal operating configuration of suc...
Ryan E. Grant, Ahmad Afsahi
IEEEPACT
2005
IEEE
14 years 1 months ago
Communication Optimizations for Fine-Grained UPC Applications
Global address space languages like UPC exhibit high performance and portability on a broad class of shared and distributed memory parallel architectures. The most scalable applic...
Wei-Yu Chen, Costin Iancu, Katherine A. Yelick