SLEEF (SIMD Library for Evaluating Elementary Functions) is a library that facilitates programming with SIMD instructions. It implements the trigonometric functions, inverse trigon...
Data-parallel architectures like SIMD (Single Instruction Multiple Data) or SIMT (Single Instruction Multiple Thread) have been adopted in many recent CPU and GPU architectures. Al...
The GPU leverages SIMD efficiency when shading because it rasterizes a triangle at a time, running the same shader on all of its fragments. Ray tracing sacrifices this shader cohe...
Jared Hoberock, Victor Lu, Yuntao Jia, John C. Har...
Power has become the most critical design constraint for embedded handheld devices. This paper proposes a power-efficient SIMD architecture, referred to as Diet SODA, for DSP appl...
Sangwon Seo, Ronald G. Dreslinski, Mark Woh, Chait...
This paper presents the Xetal-Pro SIMD processor, which is based on Xetal-II, one of the most computationally efficient (in terms of GOPS/Watt) processors available today. Xetal-Pro supp...
Yifan He, Yu Pu, Richard P. Kleihorst, Zhenyu Ye, ...
In modern wireless devices, two broad classes of compute-intensive applications are common: those with high amounts of data-level parallelism, such as signal processing used in wi...
Ganesh S. Dasika, Mark Woh, Sangwon Seo, Nathan Cl...
We propose an application specific processor for computational quantum chemistry. The kernel of interest is the computation of electron repulsion integrals (ERIs), which vary in c...
Tirath Ramdas, Gregory K. Egan, David Abramson, Ki...
The Rewrite Rule Machine (RRM) is a massively parallel MIMD/SIMD computer designed with the explicit purpose of supporting very high-level parallel programming with rewrite rules. T...