This paper presents a novel VLSI architecture for high-speed data compressor designs which implement the well-known LZ77 algorithm. The architecture mainly consists of three units...
Most programs are repetitive, where similar behavior can be seen at different execution times. Algorithms have been proposed that automatically group similar portions of a program...
The architecture for a shared memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable in...
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Providing shared-memory abstraction in messagepassing systems often simplifies the development of distributed algorithms and allows for the reuse of sharedmemory algorithms in the...