Memory interleaving is a cost-efficient way to increase memory bandwidth. Improving data access locality and reducing memory access conflicts are two key factors in achieving high efficiency with interleaved memory. In this paper, we introduce a design framework that integrates these two optimizations in order to find the minimum number of memory banks and channels that an embedded system requires under a given performance constraint. The framework incorporates several important techniques: loop and data layout transformations for data access locality, data stream extraction, cache conflict miss reduction, and data placement with optimally reordered access for interleaved memories. Experiments show that our co-design method requires substantially less hardware than implementations without optimization.

Keywords: Interleaved memory systems, data access locality, memory access conflict, in-dimension-stride vector, extracted data stream, optimally reordered a...