To achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store instructions. This paper focuses on reducing load instruction execution latency, which depends on memory access latency, pipeline depth, and data dependencies. Through load effective address prediction, the effects of both data dependencies and deep pipelines can potentially be removed from the overall execution time. If a load's effective address is correctly predicted, the data cache can be accessed speculatively before the load executes, effectively reducing load execution latency. A hybrid load effective address prediction technique is proposed that combines three basic predictors: a Last Address Predictor (LAP), a Stride Predictor (SP), and a Global Dynamic Predictor (GDP). In addition to improving load address prediction, this work explores the balance of data ports in the cache memory hierarchy, and the effects of...