processor arrays can be used as accelerators for a plenty of data flow-dominant applications. The explosive growth in research and development of massively parallel processor array architectures has lead to demand for mapping tools to realize the full potential of these architectures. Such architectures are characterized by hierarchies of parallelism and memory structures, i.e. processor array apart from different levels of Cache have large number of processing elements (PE) where each PE can further contain Sub-Word parallelism. In order to handle large scale problems, balance local memory requirements with I/O-bandwidth, and use different hierarchies of parallelism and memory one needs sophisticated transformation called hierarchical partitioning. In this paper, we introduce for the first time a detailed methodology encompassing hierarchical partitioning.