Sciweavers

440 search results - page 56 / 88
» Merging Parallel Simulation Programs
IPPS
2010
IEEE
13 years 6 months ago
A GPU-inspired soft processor for high-throughput acceleration
There is building interest in using FPGAs as accelerators for high-performance computing, but existing systems for programming them are so far inadequate. In this paper we propose...
Jeffrey Kingyens, J. Gregory Steffan
ICPP
1998
IEEE
14 years 1 month ago
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
Yong Yan, Xiaodong Zhang, Zhao Zhang
EUROPAR
2009
Springer
14 years 3 months ago
High Performance Matrix Multiplication on Many Cores
Moore’s Law suggests that the number of processing cores on a single chip increases exponentially. The future performance increases will be mainly extracted from thread-level par...
Nan Yuan, Yongbin Zhou, Guangming Tan, Junchao Zha...
ICS
2009
Tsinghua U.
14 years 3 months ago
Dynamic topology aware load balancing algorithms for molecular dynamics applications
Molecular Dynamics applications enhance our understanding of biological phenomena through bio-molecular simulations. Large-scale parallelization of MD simulations is challenging b...
Abhinav Bhatele, Laxmikant V. Kalé, Sameer ...
IISWC
2008
IEEE
14 years 3 months ago
Characterizing and improving the performance of Intel Threading Building Blocks
The Intel Threading Building Blocks (TBB) runtime library [1] is a popular C++ parallelization environment [2][3] that offers a set of methods and templates for creatin...
Gilberto Contreras, Margaret Martonosi