Sciweavers

PPOPP
2010
ACM
13 years 9 months ago
Analyzing lock contention in multithreaded applications
Many programs exploit shared-memory parallelism using multithreading. Threaded codes typically use locks to coordinate access to shared data. In many cases, contention for locks r...
Nathan R. Tallent, John M. Mellor-Crummey, Allan P...
PPOPP
2010
ACM
14 years 1 month ago
Helper locks for fork-join parallel programming
Helper locks allow programs with large parallel critical sections, called parallel regions, to execute more efficiently by enlisting processors that might otherwise be waiting on ...
Kunal Agrawal, Charles E. Leiserson, Jim Sukha
PPOPP
2010
ACM
14 years 6 months ago
Thread to strand binding of parallel network applications in massive multi-threaded systems
In processors with several levels of hardware resource sharing, like CMPs in which each core is an SMT, the scheduling process becomes more complex than in processors with a singl...
Petar Radojkovic, Vladimir Cakarevic, Javier Verdú...
PPOPP
2010
ACM
14 years 6 months ago
An adaptive performance modeling tool for GPU architectures
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
PPOPP
2010
ACM
14 years 6 months ago
Load balancing on speed
To fully exploit multicore processors, applications are expected to provide a large degree of thread-level parallelism. While adequate for low core counts and their typical worklo...
Steven Hofmeyr, Costin Iancu, Filip Blagojevic
PPOPP
2010
ACM
14 years 8 months ago
Structure-driven optimizations for amorphous data-parallel programs
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has pr...
Mario Méndez-Lojo, Donald Nguyen, Dimitrios...
PPOPP
2010
ACM
14 years 8 months ago
Applying the concurrent collections programming model to asynchronous parallel dense linear algebra
This poster is a case study on the application of a novel programming model, called Concurrent Collections (CnC), to the implementation of an asynchronous-parallel algorithm for c...
Aparna Chandramowlishwaran, Kathleen Knobe, Richar...
PPOPP
2010
ACM
14 years 8 months ago
Scheduling support for transactional memory contention management
Transactional Memory (TM) is considered one of the most promising paradigms for developing concurrent applications. TM has been shown to scale well on multiple cores when the d...
Walther Maldonado, Patrick Marlier, Pascal Felber,...
PPOPP
2010
ACM
14 years 8 months ago
Model-driven autotuning of sparse matrix-vector multiply on GPUs
We present a performance model-driven framework for automated performance tuning (autotuning) of sparse matrix-vector multiply (SpMV) on systems accelerated by graphics processing...
Jee W. Choi, Amik Singh, Richard W. Vuduc