Loop Scheduling with Complete Memory Latency Hiding on Multi-core Architecture


The widening gap between processor and memory performance is the main obstacle to high processor utilization in modern computer systems. In this paper, we propose a new technique that combines loop scheduling with memory management, Iterational Retiming with Partitioning (IRP), which can completely hide memory latencies for applications with multi-dimensional loops on architectures such as the CELL processor [1]. IRP first partitions the iteration space carefully. It then produces a two-part schedule, consisting of a processor part and a memory part, such that the execution time of the memory part never exceeds that of the processor part. The two parts execute simultaneously, so memory latency is completely hidden. Experiments on DSP benchmarks show that IRP consistently produces optimal solutions and improves significantly on previous techniques.
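The condition at the heart of the abstract (the memory part of the schedule must never run longer than the processor part) can be illustrated with a toy cost model. The sketch below is not the IRP algorithm from the paper: the class `PartitionCost`, its fields, the back-to-back memory-operation model, and the prologue that loads the first partition are all assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PartitionCost:
    """Hypothetical per-partition cost model; every field is an assumption."""
    iterations: int            # loop iterations inside one partition
    cycles_per_iteration: int  # processor cycles spent on each iteration
    memory_ops: int            # memory operations needed per partition
    cycles_per_memory_op: int  # latency of a single memory operation

    def processor_part(self) -> int:
        # Execution time of the processor part of the schedule.
        return self.iterations * self.cycles_per_iteration

    def memory_part(self) -> int:
        # Execution time of the memory part, assuming the operations
        # are issued back to back rather than pipelined.
        return self.memory_ops * self.cycles_per_memory_op

    def latency_hidden(self) -> bool:
        # Condition from the abstract: the memory part never takes
        # longer than the processor part.
        return self.memory_part() <= self.processor_part()


def overlapped_time(cost: PartitionCost, num_partitions: int) -> int:
    """Total cycles when the two parts execute simultaneously.

    Assumes a prologue that loads the first partition's data; after that,
    each partition's computation overlaps the memory part of the next one.
    """
    if num_partitions <= 0:
        return 0
    step = max(cost.processor_part(), cost.memory_part())
    return cost.memory_part() + (num_partitions - 1) * step + cost.processor_part()


def serial_time(cost: PartitionCost, num_partitions: int) -> int:
    """Total cycles when every memory part must finish before computing."""
    if num_partitions <= 0:
        return 0
    return num_partitions * (cost.processor_part() + cost.memory_part())
```

With made-up numbers such as `PartitionCost(iterations=32, cycles_per_iteration=3, memory_ops=4, cycles_per_memory_op=20)`, the processor part takes 96 cycles and the memory part 80, so `latency_hidden()` holds. Ten partitions then cost 1040 cycles overlapped versus 1760 cycles serially. When the condition fails, the `max` in `overlapped_time` shows memory latency leaking back into the total.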

Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edwin Hsing-Mean Sha


ICPADS 2006 | Memory Latencies | Memory Management Technique | Memory Performance


» An Advanced Optimizer for the IA64 Architecture

» Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level P...

» Tuning Compiler Optimizations for Simultaneous Multithreading

Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	ICPADS
Authors	Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edwin Hsing-Mean Sha
