Exploiting instruction-level parallelism (ILP) is extremely important for achieving high performance in application specific instruction set processors (ASIPs) and embedded processors. Unlike conventional general purpose processors, ASIPs and embedded processors typically run a single application and hence must be optimized extensively for this in order to extract maximum performance. Further, low power and low cost requirements of ASIPs may demand reuse of pipeline stages causing pipelines with complex structural hazards. In such architectures, exploiting higher ILP is a major challenge to the designer. Existing techniques deal with either scheduling hardware pipelines to obtain higher throughput or software pipelining
Ramaswamy Govindarajan, Erik R. Altman, Guang R. G