We introduce the first binary search tree algorithm designed for speculative executions. Prior to this work, tree structures were mainly designed for their pessimistic (non-specu...
This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Excessive power consumption is becoming a major barrier to extracting the maximum performance from high-performance parallel systems. Therefore, techniques oriented towards reduci...
Prefetching offers the potential to improve the performance of linked data structure (LDS) traversals. However, previously proposed prefetching methods only work well when there i...
Magnus Karlsson, Fredrik Dahlgren, Per Stenstr&oum...
Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing archit...