Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...
The power-efficient massively parallel Graphics Processing Units (GPUs) have become increasingly influential for scientific computing over the past few years. However, their ef...
Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, Kai Tian, ...
Modern computer architectures increasingly depend on mechanisms that estimate future control flow decisions to increase performance. Mechanisms such as speculative execution and p...
The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buffering and reusing dynamic instruction traces. This work presents a new block-b...
We propose a novel and generic video/image reranking algorithm, IB reranking, which reorders results from text-only searches by discovering the salient visual patterns of relevant...