Sciweavers

262 search results - page 39 / 53
» Tilings as a programming exercise
Sort
View
SC
2009
ACM
14 years 4 months ago
Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems
This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a tas...
Fengguang Song, Asim YarKhan, Jack Dongarra
NOCS
2009
IEEE
14 years 4 months ago
Silicon-photonic clos networks for global on-chip communication
Future manycore processors will require energyefficient, high-throughput on-chip networks. Siliconphotonics is a promising new interconnect technology which offers lower power, h...
Ajay Joshi, Christopher Batten, Yong-Jin Kwon, Sco...
IEEEPACT
2009
IEEE
14 years 4 months ago
Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling
—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...
ISHPC
2003
Springer
14 years 3 months ago
Code and Data Transformations for Improving Shared Cache Performance on SMT Processors
Simultaneous multithreaded processors use shared on-chip caches, which yield better cost-performance ratios. Sharing a cache between simultaneously executing threads causes excessi...
Dimitrios S. Nikolopoulos
ICS
1999
Tsinghua U.
14 years 2 months ago
Nonlinear array layouts for hierarchical memory systems
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory....
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Le...