Abstract. Dynamic data redistribution enhances data locality and improves algorithm performance for numerous scientific problems on distributed-memory multicomputer systems. Prev...
Fibonacci Cubes (FCs), together with their enhanced and extended forms, are a family of interconnection topologies formed by diluting links from the binary hypercube. While they scale up...
Abstract. Power-balanced instruction scheduling for Very Long Instruction Word (VLIW) processors is an optimization problem which requires a good instruction-level power model for ...
Shu Xiao, Edmund Ming-Kit Lai, A. Benjamin Premkum...
Abstract. Given the increasing gap between processors and memory, prefetching data into cache becomes an important strategy for preventing the processor from being starved of data....
To enhance the performance of a computer, most modern processors use a superscalar architecture and raise the clock frequency. A superscalar architecture can execute more than...
Control hazards induced by conditional branches cause significant performance loss in modern out-of-order superscalar processors. Dynamic branch prediction techniques help alleviate th...