This study compares the speed, area, and power of di erent implementations of Active Pages OCS98], an intelligent memory system which helps bridge the growing gap between processor and memory performance by associating simple functions with each page of data. Previous investigations have shown up to 1000X speedups using a block of recon gurable logic to implement these functions next to each subarray on a DRAM chip. In this study, we show that instruction-level parallelism, not hardware specialization, is the key to the previous success with recon gurable logic. In order to demonstrate this fact, an Active Page implementation based upon a simpli ed VLIW processor was developed. Unlike conventional VLIW processors, power and area constraints lead to a design which has a small number of pipeline stages. Our results demonstrate that a four-wide VLIW processor attains comparable performance to that of pure FPGA logic but requires signi cantly less area and power.