Embedded digital signal processors for baseband communication systems have stringent design constraints including high computational bandwidth, low power consumption, and low inter...
Michael J. Schulte, C. John Glossner, Suman Mamidi...
The NVIDIA® OptiX™ ray tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures. The OptiX engine builds on the key observation ...
Steven G. Parker, James Bigler, Andreas Dietrich, ...
We explore runtime mechanisms and policies for scheduling dynamic multi-grain parallelism on heterogeneous multi-core processors. Heterogeneous multi-core processors integrate con...
Filip Blagojevic, Dimitrios S. Nikolopoulos, Alexa...
Grid systems are well-known for its high performance computing or large data storage with inexpensive devices. They can be categorized into two major types: computational grid and ...
—The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...