This paper proposes a massively parallel hardware architecture for efficient tracing of incoherent rays, e.g. for global illumination. The general approach is centered around hier...
Abstract--Modular multiplication of long integers is an important building block for cryptographic algorithms. Although several FPGA accelerators have been proposed for large modul...
Gary Chun Tak Chow, Ken Eguro, Wayne Luk, Philip L...
3D stacked circuits reduce communication delay in multicore system-on-chips (SoCs) and enable heterogeneous integration of cores, memories, sensors, and RF devices. However, vertic...
Mohamed M. Sabry, Ayse Kivilcim Coskun, David Atie...
Single-thread performance, reliability and power efficiency are critical design challenges of future multicore systems. Although point solutions have been proposed to address thes...
Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott ...
Wide Single Instruction, Multiple Thread (SIMT) architectures often require a static allocation of thread groups that are executed in lockstep throughout the entire application ker...