Sciweavers

307 search results - page 48 / 62
» Automated Performance Measurement of Parallel Programs
SIGMOD
2008
ACM
14 years 8 months ago
Relational joins on graphics processors
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for wr...
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga K. ...
IPPS
2005
IEEE
14 years 2 months ago
MOCCA - Towards a Distributed CCA Framework for Metacomputing
We describe the design and implementation of MOCCA, a distributed CCA framework implemented using the H2O metacomputing system. Motivated by the quest for appropriate metasyste...
Maciej Malawski, Dawid Kurzyniec, Vaidy S. Sundera...
ESA
1998
Springer
14 years 6 days ago
External Memory Algorithms
Abstract. Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between ...
Jeffrey Scott Vitter
PPOPP
2010
ACM
13 years 7 months ago
Analyzing lock contention in multithreaded applications
Many programs exploit shared-memory parallelism using multithreading. Threaded codes typically use locks to coordinate access to shared data. In many cases, contention for locks r...
Nathan R. Tallent, John M. Mellor-Crummey, Allan P...
IEEEPACT
2009
IEEE
14 years 3 months ago
Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling
Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus Pü...