Sciweavers

CGO
2004
IEEE
13 years 11 months ago
Physical Experimentation with Prefetching Helper Threads on Intel's Hyper-Threaded Processors
Pre-execution techniques have received much attention as an effective way of prefetching cache blocks to tolerate the ever-increasing memory latency. A number of pre-execution tech...
Dongkeun Kim, Shih-Wei Liao, Perry H. Wang, Juan d...
CGO
2004
IEEE
13 years 11 months ago
Targeted Path Profiling: Lower Overhead Path Profiling for Staged Dynamic Optimization Systems
In this paper, we present a technique for reducing the overhead of collecting path profiles in the context of a dynamic optimizer. The key idea to our approach, called Targeted Pa...
Rahul Joshi, Michael D. Bond, Craig B. Zilles
CGO
2004
IEEE
13 years 11 months ago
Compiler Optimization of Memory-Resident Value Communication Between Speculative Threads
Efficient inter-thread value communication is essential for improving performance in Thread-Level Speculation (TLS). Although several mechanisms for improving value communication ...
Antonia Zhai, Christopher B. Colohan, J. Gregory S...
CGO
2004
IEEE
13 years 11 months ago
Using Dynamic Binary Translation to Fuse Dependent Instructions
Instruction scheduling hardware can be simplified and easily pipelined if pairs of dependent instructions are fused so they share a single instruction scheduling slot. We study an...
Shiliang Hu, James E. Smith
CGO
2004
IEEE
13 years 11 months ago
Exploring Code Cache Eviction Granularities in Dynamic Optimization Systems
Dynamic optimization systems store optimized or translated code in a software-managed code cache in order to maximize reuse of transformed code. Code caches store superblocks that...
Kim M. Hazelwood, James E. Smith
CGO
2004
IEEE
13 years 11 months ago
Exposing Memory Access Regularities Using Object-Relative Memory Profiling
Memory profiling is the process of characterizing a program's memory behavior by observing and recording its response to specific input sets. Relevant aspects of the program...
Qiang Wu, Artem Pyatakov, Alexey Spiridonov, Easwa...
CGO
2004
IEEE
13 years 11 months ago
VHC: Quickly Building an Optimizer for Complex Embedded Architectures
To meet the high demand for powerful embedded processors, VLIW architectures are increasingly complex (e.g., multiple clusters), and moreover, they now run increasingly sophistica...
Michael Dupré, Nathalie Drach, Olivier Tema...
CGO
2004
IEEE
13 years 11 months ago
The Accuracy of Initial Prediction in Two-Phase Dynamic Binary Translators
Dynamic binary translators use a two-phase approach to identify and optimize frequently executed code dynamically. In the first step (profiling phase), blocks of code are interpre...
Youfeng Wu, Mauricio Breternitz Jr., Justin Quek, ...
CGO
2004
IEEE
13 years 11 months ago
A Compiler Scheme for Reusing Intermediate Computation Results
Recent research has shown that programs often exhibit value locality. Such locality occurs when a code segment, although executed repeatedly in the program, takes only a small num...
Yonghua Ding, Zhiyuan Li