We describe the design, implementation and performance of a high-performance Web server accelerator which runs on an embedded operating system and improves Web server performance ...
Junehwa Song, Arun Iyengar, Eric Levy-Abegnoli, Da...
The distribution of resources among processors, memory and caches is a crucial question faced by designers of large-scale parallel machines. If a machine is to solve problems with...
Microprocessor designers have been torn between tight constraints on the amount of on-chip cache memory and the high latency of off-chip memory, such as dynamic random access memor...
Xi Chen, Lei Yang, Robert P. Dick, Li Shang, Haris...
Abstract--One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cacheaware (CA) and cache-oblivious (CO) algorithms take into...
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...