Combining ideas from several previous proposals, such as Active Pages, DIVA, and ULMT, we present the Memory Arithmetic Unit and Interface (MAUI) architecture. Because the “intelligence” of the MAUI intelligent memory system resides in the memory controller, logic and DRAM need not be integrated on a single chip, and off-the-shelf DRAMs can be used. The MAUI’s computational engine performs memory-bound SIMD computations close to the memory system, enabling more efficient memory pipelining. A simulator modeling the MAUI architecture was added to the SimpleScalar v4.0 toolset. As expected, simulations show that application speedup grows with both memory system speed and dataset size. The results show that single-threaded application speedups of over 100% are possible and suggest that a total system speedup of about 300% is achievable in a multi-threaded environment.
General Terms: Performance
Keywords: intelligent ...
Justin Teller, Charles B. Silio Jr., Bruce L. Jaco