Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip estate. Simulation, in particular trace-driven simulation, is widely used to estimate cache hit rates. However, simulation is too slow to be deployed in the design space exploration, specially when it involves hundreds of design points and huge traces or long program execution. In this paper, we propose a novel static analysis technique for rapid and accurate design space exploration of instruction caches. Given the program control flow graph (CFG) annotated only with basic block and control flow edge execution counts, our analysis estimates the hit rates for multiple cache configurations in one pass. We achieve this by modeling the cache states at each node of the CFG in probabilistic manner and exploiting the structural similarities among related cache configurations. Experimental results indicate that our analysis is 24–3,855 ti...