Sciweavers

440 search results - page 56 / 88
» Merging Parallel Simulation Programs
Sort
View
164
Voted
IPPS
2010
IEEE
15 years 1 months ago
A GPU-inspired soft processor for high-throughput acceleration
There is building interest in using FPGAs as accelerators for high-performance computing, but existing systems for programming them are so far inadequate. In this paper we propose...
Jeffrey Kingyens, J. Gregory Steffan
158
Voted
ICPP
1998
IEEE
15 years 8 months ago
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
Yong Yan, Xiaodong Zhang, Zhao Zhang
119
Voted
EUROPAR
2009
Springer
15 years 10 months ago
High Performance Matrix Multiplication on Many Cores
Moore’s Law suggests that the number of processing cores on a single chip increases exponentially. The future performance increases will be mainly extracted from thread-level par...
Nan Yuan, Yongbin Zhou, Guangming Tan, Junchao Zha...
151
Voted
ICS
2009
Tsinghua U.
15 years 10 months ago
Dynamic topology aware load balancing algorithms for molecular dynamics applications
Molecular Dynamics applications enhance our understanding of biological phenomena through bio-molecular simulations. Large-scale parallelization of MD simulations is challenging b...
Abhinav Bhatele, Laxmikant V. Kalé, Sameer ...
132
Voted
IISWC
2008
IEEE
15 years 10 months ago
Characterizing and improving the performance of Intel Threading Building Blocks
Abstract— The Intel Threading Building Blocks (TBB) runtime library [1] is a popular C++ parallelization environment [2][3] that offers a set of methods and templates for creatin...
Gilberto Contreras, Margaret Martonosi