Information about the run-time behavior of software applications is crucial for enabling system-level optimizations in embedded systems. This embedded Software Metadata is especially important today, because several complex multi-threaded applications are mapped onto the memory of a single embedded system. Each thread is triggered at run-time by different input events that cannot be predicted at design-time. New methods and tools are needed to automatically profile and analyze the dynamic data access behavior of simultaneously executing threads in order to enable memory data transfer optimizations. In this paper, we propose such a method and tool, which extract the Software Metadata needed to enable these data transfer optimizations at the system level. We assess the effectiveness of our approach with results for five real-life software applications using seven real-life run-time input traces.