FPL
2007
Springer

A Load/Store Unit for a Memcpy Hardware Accelerator

Recently, a dedicated hardware accelerator was proposed that works in conjunction with the caches found next to modern microprocessors to speed up the commonly used memcpy operation. The main assumption of that proposal was that the data to be copied already resides in the cache, which does not always hold. In this paper, we present a dedicated load/store unit and its implementation, which cooperates with the previously proposed memcpy hardware accelerator and the cache to ensure that the data becomes available in the cache. Experimental results using synthetic benchmarks show that the load/store unit, in conjunction with the memcpy hardware accelerator, can reduce memcpy latency by 85% (when the data is not present in the cache) compared to a highly optimized, hand-coded assembly software implementation.
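To put the reported figure in perspective, an 85% latency reduction corresponds to a speedup of roughly 6.7x over the software baseline in the uncached case. The short Python sketch below shows the conversion. The function name is illustrative and does not come from the paper; the only value taken from the abstract is the 85% figure.

```python
def speedup_from_latency_reduction(reduction: float) -> float:
    """Convert a fractional latency reduction (0 <= r < 1) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    # The new latency is (1 - r) times the baseline latency,
    # so speedup = baseline / new = 1 / (1 - r).
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # 0.85 is the latency reduction reported in the abstract.
    print(f"{speedup_from_latency_reduction(0.85):.2f}x")
```

The conversion assumes the 85% figure is measured against the same baseline latency throughout, as the abstract states for the case where the data is not present in the cache.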
Stamatis Vassiliadis, Filipa Duarte, Stephan Wong
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where FPL
Authors Stamatis Vassiliadis, Filipa Duarte, Stephan Wong