Sciweavers

70 search results - page 4 / 14
» Data parallel execution challenges and runtime performance o...
Sort
View
ICPP
1998
IEEE
13 years 11 months ago
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
Yong Yan, Xiaodong Zhang, Zhao Zhang
EUROPAR
2009
Springer
13 years 11 months ago
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures
Abstract. In the field of HPC, the current hardware trend is to design multiprocessor architectures that feature heterogeneous technologies such as specialized coprocessors (e.g., ...
Cédric Augonnet, Samuel Thibault, Raymond N...
PLDI
2012
ACM
11 years 10 months ago
Adaptive input-aware compilation for graphics engines
While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high performance computations, the tedious process of performance tuning required...
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Jan...
HPCA
1998
IEEE
13 years 11 months ago
Performance Study of a Concurrent Multithreaded Processor
The performance of a concurrent multithreaded architectural model, called superthreading 15 , is studied in this paper. It tries to integrate optimizing compilation techniques and...
Jenn-Yuan Tsai, Zhenzhen Jiang, Eric Ness, Pen-Chu...
IPPS
2008
IEEE
14 years 1 months ago
Overcoming scaling challenges in biomolecular simulations across multiple platforms
NAMD† is a portable parallel application for biomolecular simulations. NAMD pioneered the use of hybrid spatial and force decomposition, a technique now used by most scalable pr...
Abhinav Bhatele, Sameer Kumar, Chao Mei, James C. ...