Sciweavers

26 search results - page 4 / 6
» Loop Optimization using Hierarchical Compilation and Kernel ...
Sort
View
LCPC
2004
Springer
14 years 27 days ago
Empirical Performance-Model Driven Data Layout Optimization
Abstract. Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of l...
Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Ge...
IPPS
2002
IEEE
14 years 13 days ago
Effective Cross-Platform, Multilevel Parallelism via Dynamic Adaptive Execution
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical archit...
Walden Ko, Mark N. Yankelevsky, Dimitrios S. Nikol...
CF
2009
ACM
14 years 2 months ago
Mapping the LU decomposition on a many-core architecture: challenges and solutions
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
Ioannis E. Venetis, Guang R. Gao
IPPS
2007
IEEE
14 years 1 months ago
POET: Parameterized Optimizations for Empirical Tuning
The excessive complexity of both machine architectures and applications have made it difficult for compilers to statically model and predict application behavior. This observatio...
Qing Yi, Keith Seymour, Haihang You, Richard W. Vu...
CF
2005
ACM
13 years 9 months ago
A case for a working-set-based memory hierarchy
Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design...
Steve Carr, Soner Önder