An asynchronous work-stealing approach to dynamic load balancing is implemented using Unified Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...
An emerging direction in chip fabrication is 3D integration technology. It stacks two or more dies vertically with a dense, high-speed interface to increase the device density and ...
Xiuyi Zhou, Yi Xu, Yu Du, Youtao Zhang, Jun Yang 0...
Strategies are developed for “fattening” the tasks of computation-dags so as to accommodate the heterogeneity of remote clients in Internet-based computing (IC). Earlier work ...
Transactional Memory (TM) takes responsibility for concurrent, atomic execution of labeled regions of code, freeing the programmer from the need to manage locks. Typical impleme...
Michael F. Spear, Michael Silverman, Luke Dalessan...
We consider resource allocation for distributed streaming applications running in a grid environment, where continuously streaming data needs to be aggregated and processed to p...
The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) p...
We examine the problem of parallelizing the inferencing process for OWL knowledge-bases. A key challenge in this problem is partitioning the computational workload of t...
Data replication is a typical strategy for improving access performance and data availability in Data Grid systems. Current works on data replication in Grid systems focus on t...
The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing ...
Thomas M. DuBois, Bryant Lee, Yi Wang, Marc Olano,...