In this paper we propose a dynamic code overlay technique of synchronous data-flow (SDF) –modeled program for low-end embedded systems which lack MMUsupport. With this technique, the system can utilize expensive SRAM memory more efficiently by using flash memory as code storage. SRAM is divided into several regions called overlay slots. A data-flow block or a cluster of data-flow blocks is loaded into the corresponding overlay slot on demand at run-time. Which blocks are clustered together and which overlay slots are allocated to the clusters are statically decided by the clustering and placement algorithm. We also propose an automatic code generation framework that generates the C-program code, dynamic loader and linker script files from the given SDFmodeled blocks and schematic, so we can run or simulate the program immediately without any additional coding effort. Experiments report that we can reduce the SRAM size significantly with a reasonable amount of time overhead for sever...