Abstract In a load balancing network each processor has an initial collection of unit-size jobs, tokens, and in each round, pairs of processors connected by balancers split their l...
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for wr...
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga K. ...
—Register allocation, in high-level synthesis and ASIP design, is the process of determining the number of registers to include in the resulting circuit or processor. The goal is...
Decoupled Software Pipelining (DSWP) is one approach to automatically extract threads from loops. It partitions loops into long-running threads that communicate in a pipelined man...
Jialu Huang, Arun Raman, Thomas B. Jablin, Yun Zha...
—Genetic programming (GP), is a very general and efficient technique, often capable of outperforming more specialized techniques on a variety of tasks. In this paper, we suggest ...