Inherent within complex instruction set architectures such as x86 are inefficiencies that do not exist in a simpler ISAs. Modern x86 implementations decode instructions into one o...
Brian Slechta, David Crowe, Brian Fahs, Michael Fe...
A programmable parallel digital signal processor (DSP) core for embedded applications is presented which combines the concepts of single instruction stream over multiple data stre...
Liang Han, Jie Chen, Chaoxian Zhou, Ying Li, Xin Z...
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
One approach to high-performance processing of massive data sets is to incorporate computation into storage systems. Previous work has shown that this active storage model is effe...
Rajiv Wickremesinghe, Jeffrey S. Chase, Jeffrey Sc...
Multiple issue of instructions occurs in superscalar and VLIW machines. This paper investigates a third type of machine design, which combines the advantages of code compatibility...