There is a growing demand for web applications to provide fair service to the highly concurrent requests. In this paper, we present an approach to addressing this requirement. Bas...
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
The memory hierarchy of a system can consume up to 50% of microprocessor system power. Previous work has shown that tuning a configurable cache to a particular application can red...
Our research goal is to design systems that enable humans to teach tedious, repetitive, simple tasks to a computer. We propose here a learner/problem solver architecture for such ...