Sciweavers

307 search results - page 26 / 62
» Automated Performance Measurement of Parallel Programs
AMAST
2008
Springer
13 years 10 months ago
System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries
We demonstrate Spiral, a domain-specific library generation system. Spiral generates high performance source code for linear transforms (such as the discrete Fourier transform and ...
Yevgen Voronenko, Franz Franchetti, Fréd...
ICS
1999
Tsinghua U.
14 years 27 days ago
Eliminating synchronization bottlenecks in object-based programs using adaptive replication
This paper presents a technique, adaptive replication, for automatically eliminating synchronization bottlenecks in multithreaded programs that perform atomic operations on object...
Martin C. Rinard, Pedro C. Diniz
FCCM
2011
IEEE
13 years 11 days ago
Synthesis of Platform Architectures from OpenCL Programs
The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this pap...
Muhsen Owaida, Nikolaos Bellas, Konstantis Dalouka...
CLUSTER
2007
IEEE
14 years 3 months ago
Balancing productivity and performance on the Cell Broadband Engine
The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instruction-multiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
PPOPP
2006
ACM
14 years 2 months ago
Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster
Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power ...
Robert Springer, David K. Lowenthal, Barry Rountre...