Dependences among loads and stores whose addresses are unknown hinder the extraction of instruction level parallelism during the execution of a sequential program. Such ambiguous ...
Sridhar Gopal, T. N. Vijaykumar, James E. Smith, G...
A novel dynamic register renaming approach is proposed in this work. The key idea of the novel scheme is to delay the allocation of physical registers until a late stage in the pi...
Scalability and reliability are inseparable in high-performance computing. Fault-isolation through hardware is a popular means of providing reliability. Unfortunately, such isolat...
Scalable shared-memory multiprocessors that are designed as Cache-Only Memory Architectures Coma allow automatic replication and migration of data in the main memory. This enhance...
Over the past few years there has been increased interest in building custom computing machines (CCMs) as a way of achieving very high performance on specific problems. The advent...
David Abramson, Paul Logothetis, Adam Postula, Mar...
This paper describes PRISM, a distributed sharedmemory architecture that relies on a tightly integrated hardware and operating system design for scalable and reliable performance....
Abstract. Dynamic program optimization is the only recourse for optimizing compilers when machine and program parameters necessary for applying an optimization technique are unknow...
On-chip multiprocessor can be an alternative to the wide-issue superscalar processor approach which is currently the mainstream to exploit the increasing number of transistors on ...