In this paper we present the internal representation and optimizations used by the CASH compiler for improving the memory parallelism of pointer-based programs. CASH uses an SSA-b...
Current processors exploit out-of-order execution and branch prediction to improve instruction level parallelism. When a branch prediction is wrong, processors flush the pipeline ...
Hardware compilation techniques which use highlevel programming languages to describe and synthesize hardware are gaining popularity. They are especially useful for reconfigurable...
Parallel software for solving the quadratic program arising in training support vector machines for classification problems is introduced. The software implements an iterative dec...
This paper presents mathematical foundations for the design of a memory controller subcomponent that helps to bridge the processor/memory performance gap for applications with str...
Binu K. Mathew, Sally A. McKee, John B. Carter, Al...