Sciweavers

IEEEPACT
2005
IEEE
14 years 5 months ago
Characterization of TCC on Chip-Multiprocessors
Transactional Coherence and Consistency (TCC) is a novel coherence scheme for shared memory multiprocessors that uses programmer-defined transactions as the fundamental unit of p...
Austen McDonald, JaeWoong Chung, Hassan Chafi, Chi...
IEEEPACT
2005
IEEE
14 years 5 months ago
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Scratchpad memory (SPM), a fast software-managed onchip SRAM, is now widely used in modern embedded processors. Compared to hardware-managed cache, it is more efficient in perfor...
Lian Li 0002, Lin Gao 0002, Jingling Xue
IEEEPACT
2005
IEEE
14 years 5 months ago
HUNTing the Overlap
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Costin Iancu, Parry Husbands, Paul Hargrove
IEEEPACT
2005
IEEE
14 years 5 months ago
Communication Optimizations for Fine-Grained UPC Applications
Global address space languages like UPC exhibit high performance and portability on a broad class of shared and distributed memory parallel architectures. The most scalable applic...
Wei-Yu Chen, Costin Iancu, Katherine A. Yelick
IEEEPACT
2005
IEEE
14 years 5 months ago
Maximizing CMP Throughput with Mediocre Cores
In this paper we compare the performance of area equivalent small, medium, and large-scale multithreaded chip multiprocessors (CMTs) using throughput-oriented applications. We use...
John D. Davis, James Laudon, Kunle Olukotun
IEEEPACT
2005
IEEE
14 years 5 months ago
Performance Analysis of System Overheads in TCP/IP Workloads
Current high-performance computer systems are unable to saturate the latest available high-bandwidth networks such as 10 Gigabit Ethernet. A key obstacle in achieving 10 gigabits ...
Nathan L. Binkert, Lisa R. Hsu, Ali G. Saidi, Rona...
IEEEPACT
2005
IEEE
14 years 5 months ago
A Simple Divide-and-Conquer Approach for Neural-Class Branch Prediction
The continual demand for greater performance and growing concerns about the power consumption in highperformance microprocessors make the branch predictor a critical component of ...
Gabriel H. Loh
IEEEPACT
2005
IEEE
14 years 5 months ago
Trace Cache Sampling Filter
This paper presents a new technique for efficient usage of small trace caches. A trace cache can significantly increase the performance of wide out-oforder processors, but to be e...
Michael Behar, Avi Mendelson, Avinoam Kolodny
IEEEPACT
2005
IEEE
14 years 5 months ago
Future Execution: A Hardware Prefetching Technique for Chip Multiprocessors
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-cont...
Ilya Ganusov, Martin Burtscher