— Effective address calculation for load and store instructions needs to compete for ALU with other instructions and hence extra latencies might be incurred to data cache accesses. Fast address generation is an approach proposed to reduce cache access latencies. This paper presents a fast address generator that can eliminate most of the effective address computations. Experimental results show that this fast address generator can reduce effective address computations of load and store instructions by about 74% on average for SPECint2000 benchmarks and cut the execution times by 8.5%. In addition, further improvement can be made if data of previous load operations are buffered in the unused data field of LSQ entries as well. Runtime impact will expand to 10.5% on average when the default LSQ is modified to the cached LSQ design.