The slow speed of conventional execution-driven architecture simulators is a serious impediment to obtaining desirable research productivity. This paper proposes and evaluates a f...
We design an incentive-compatible mechanism for scheduling n non-malleable parallel jobs on a parallel system comprising m identical processors. Each job is owned by a selfish us...
JPEG2000 is the latest still image coding standard from the JPEG committee, which adopts new algorithms such as Embedded Block Coding with Optimized Truncation (EBCOT) and Discret...
We introduce Scioto, Shared Collections of Task Objects, a lightweight framework for providing task management on distributed memory machines under one-sided and global-view parall...
James Dinan, Sriram Krishnamoorthy, D. Brian Larki...
This paper presents a novel stateless, virtualized communication engine for sub-microsecond latency. Using a Field-Programmable Gate Array (FPGA) based prototype, we show a latency...
Location Dependent Information Services (LDISs) have gained increasing popularity in recent years, and due to limited client power and intermittent connectivity, caching is an ...
We identify the challenges that are special to parallel sparse matrix-matrix multiplication (PSpGEMM). We show that sparse algorithms are not as scalable as their dense counterpar...
Federated systems have recently attracted much attention because they allow loosely coupled organizations to share resources for common benefits. However, discovering r...
The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...