PPOPP 2016 | Sciweavers

PPOPP
2016
ACM

Auto-vectorizing a large-scale production unstructured-mesh CFD application

For modern x86 based CPUs with increasingly longer vector lengths, achieving good vectorization has become very important for gaining higher performance. Using very explicit SIMD ...

Gihan R. Mudalige, I. Z. Reguly, Michael B. Giles

High performance model based image reconstruction

In Computed Tomography (CT) methods, Model Based Iterative Reconstruction (MBIR) produces higher quality images than commonly used Filtered Backprojection (FBP) but at a very high...

Xiao Wang, Amit Sabne, Sherman J. Kisner, Anand Ra...

Declarative coordination of graph-based parallel programs

Declarative programming has been hailed as a promising approach to parallel programming since it makes it easier to reason about programs while hiding the implementation details o...

Flávio Cruz, Ricardo Rocha, Seth Copen Gold...

Benchmarking weak memory models

To achieve good multi-core performance, modern microprocessors have weak memory models, rather than enforce sequential consistency. This gives the programmer a wide scope for choo...

Carl G. Ritson, Scott Owens

Be my guest: MCS lock now welcomes guests

The MCS lock is one of the most prevalent queuing locks. It provides fair scheduling and high performance on massively parallel systems. However, the MCS lock mandates a bring-you...

Tianzheng Wang, Milind Chabbi, Hideaki Kimura

OPR: deterministic group replay for one-sided communication

Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Ia...

Accelerating Dynamic Data Race Detection Using Static Thread Interference Analysis

Precise dynamic race detectors report an error if and only if more than one thread concurrently exhibits conflict on a memory access. They insert instrumentations at compile time ...

Peng Di, Yulei Sui

Lease/release: architectural support for scaling contended data structures

High memory contention is generally agreed to be a worst-case scenario for concurrent data structures. There has been a significant amount of research effort spent investigating ...

Syed Kamran Haider, William Hasenplaugh, Dan Alist...

Optimistic concurrency with OPTIK

We introduce OPTIK, a new practical design pattern for designing and implementing fast and scalable concurrent data structures. OPTIK relies on the commonly-used technique of vers...

Rachid Guerraoui, Vasileios Trigonakis

Performance portable GPU code generation for matrix multiplication

Parallel accelerators such as GPUs are notoriously hard to program; exploiting their full performance potential is a job best left for ninja programmers. High-level programming la...

Toomas Remmelg, Thibaut Lutz, Michel Steuwer, Chri...
