Sciweavers

315 search results - page 49 / 63
» Improving Performance by Branch Reordering
ICS
2007
Tsinghua U.
15 years 9 months ago
Automatic nonblocking communication for partitioned global address space programs
Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...
ASPLOS
2011
ACM
14 years 7 months ago
On-the-fly elimination of dynamic irregularities for GPU computing
The power-efficient massively parallel Graphics Processing Units (GPUs) have become increasingly influential for scientific computing over the past few years. However, their ef...
Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, Kai Tian, ...
POPL
1994
ACM
15 years 7 months ago
Reducing Indirect Function Call Overhead in C++ Programs
Modern computer architectures increasingly depend on mechanisms that estimate future control flow decisions to increase performance. Mechanisms such as speculative execution and p...
Brad Calder, Dirk Grunwald
ISCA
1999
IEEE
15 years 8 months ago
The Block-Based Trace Cache
The trace cache is a recently proposed solution to achieving high instruction fetch bandwidth by buffering and reusing dynamic instruction traces. This work presents a new block-b...
Bryan Black, Bohuslav Rychlik, John Paul Shen
MM
2006
ACM
15 years 9 months ago
Video search reranking via information bottleneck principle
We propose a novel and generic video/image reranking algorithm, IB reranking, which reorders results from text-only searches by discovering the salient visual patterns of relevant...
Winston H. Hsu, Lyndon S. Kennedy, Shih-Fu Chang