Sciweavers

IEEEPACT
2005
IEEE
14 years 2 months ago
Memory State Compressors for Giga-Scale Checkpoint/Restore
We propose a checkpoint store compression method for coarse-grain giga-scale checkpoint/restore. This mechanism can be useful for debugging, post-mortem analysis and error recover...
Andreas Moshovos, Alexandros Kostopoulos
IEEEPACT
2005
IEEE
14 years 2 months ago
Characterization of TCC on Chip-Multiprocessors
Transactional Coherence and Consistency (TCC) is a novel coherence scheme for shared memory multiprocessors that uses programmer-defined transactions as the fundamental unit of p...
Austen McDonald, JaeWoong Chung, Hassan Chafi, Chi...
IEEEPACT
2005
IEEE
14 years 2 months ago
Memory Coloring: A Compiler Approach for Scratchpad Memory Management
Scratchpad memory (SPM), a fast software-managed onchip SRAM, is now widely used in modern embedded processors. Compared to hardware-managed cache, it is more efficient in perfor...
Lian Li 0002, Lin Gao 0002, Jingling Xue
IEEEPACT
2005
IEEE
14 years 2 months ago
HUNTing the Overlap
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Costin Iancu, Parry Husbands, Paul Hargrove
IEEEPACT
2005
IEEE
14 years 2 months ago
Communication Optimizations for Fine-Grained UPC Applications
Global address space languages like UPC exhibit high performance and portability on a broad class of shared and distributed memory parallel architectures. The most scalable applic...
Wei-Yu Chen, Costin Iancu, Katherine A. Yelick
IEEEPACT
2005
IEEE
14 years 2 months ago
Maximizing CMP Throughput with Mediocre Cores
In this paper we compare the performance of area equivalent small, medium, and large-scale multithreaded chip multiprocessors (CMTs) using throughput-oriented applications. We use...
John D. Davis, James Laudon, Kunle Olukotun
IEEEPACT
2005
IEEE
14 years 2 months ago
Performance Analysis of System Overheads in TCP/IP Workloads
Current high-performance computer systems are unable to saturate the latest available high-bandwidth networks such as 10 Gigabit Ethernet. A key obstacle in achieving 10 gigabits ...
Nathan L. Binkert, Lisa R. Hsu, Ali G. Saidi, Rona...
IEEEPACT
2005
IEEE
14 years 2 months ago
A Simple Divide-and-Conquer Approach for Neural-Class Branch Prediction
The continual demand for greater performance and growing concerns about the power consumption in highperformance microprocessors make the branch predictor a critical component of ...
Gabriel H. Loh