PPOPP
2015
ACM
Exploiting communication concurrency on high performance computing systems
Although logically available, applications may not exploit enough instantaneous communication concurrency to maximize hardware utilization on HPC systems. This is exacerbated in h...
Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Willia...
PPOPP
2015
ACM
GPU-SM: shared memory multi-GPU programming
Discrete GPUs in modern multi-GPU systems can transparently access each other’s memories through the PCIe interconnect. Future systems will improve this capability by including ...
Javier Cabezas, Marc Jordà, Isaac Gelado, N...
PPOPP
2015
ACM
Barrier elision for production parallel programs
Large scientific code bases are often composed of several layers of runtime libraries, implemented in multiple programming languages. In such situations, programmers often choose ...
Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushi...
PPOPP
2015
ACM
The SprayList: a scalable relaxed priority queue
High-performance concurrent priority queues are essential for applications such as task scheduling and discrete event simulation. Unfortunately, even the best performing implement...
Dan Alistarh, Justin Kopinsky, Jerry Li, Nir Shavi...
PPOPP
2015
ACM
A performance study of Java garbage collectors on multicore architectures
In the last few years, managed runtime environments such as the Java Virtual Machine (JVM) have been increasingly used on large-scale multicore servers. The garbage collector (GC) repre...
Maria Carpen Amarie, Patrick Marlier, Pascal Felbe...
PPOPP
2015
ACM
NUMA-aware graph-structured analytics
Graph-structured analytics has been widely adopted in a number of big data applications such as social computation, web-search and recommendation systems. Though much prior resear...
Kaiyuan Zhang, Rong Chen, Haibo Chen
PPOPP
2015
ACM
Optimization of asynchronous graph processing on GPU with hybrid coloring model
Modern GPUs have been widely used to accelerate the graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynch...
Xuanhua Shi, Junling Liang, Sheng Di, Bingsheng He...
PPOPP
2015
ACM
Stochastic gradient descent on GPUs
Irregular algorithms such as Stochastic Gradient Descent (SGD) can benefit from the massive parallelism available on GPUs. However, unlike in data-parallel algorithms, synchroniz...
Rashid Kaleem, Sreepathi Pai, Keshav Pingali