SuperMatrix out-of-order scheduling leverages high-level abstractions and straightforward data dependency analysis to provide a general-purpose mechanism for obtaining parallelism from...
Ernie Chan, Field G. Van Zee, Enrique S. Quintana-...
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
In this paper we present an algorithm for scheduling parallel applications that consist of a divisible workload. Our algorithm uses multiple rounds to overlap communication and co...
Efficient execution of well-parallelized applications is central to performance in the multicore era. Program analysis tools support the hardware and software sides of this effor...
Next-generation high-end Network Processors (NP) must address demands from both diversified applications and ever-increasing traffic pressure. One major challenge is to design an e...
Lei Shi, Yue Zhang 0006, Jianming Yu, Bo Xu, Bin L...