In this paper, we present a constraint-based approach for selecting and deploying active network components, called proxylets, which execute on servers within the network to perfo...
—In an attempt to increase the performance/cost ratio, large compute clusters are becoming heterogeneous at multiple levels: from asymmetric processors, to different system archi...
— Skewed-associative caches use several hash functions to reduce collisions in caches without increasing the associativity. This technique can increase the hit ratio of a cache w...
Henk L. Muller, Paul W. A. Stallard, David H. D. W...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF8 to UTF-...
—This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications w...