The implementation of modern high-performance computers is increasingly directed toward parallelism in the hardware. However, most current fetch units are limited to one branch prediction per cycle and can therefore fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue at most four instructions per cycle, it is not sufficient for processors with higher peak issue rates. If multiple block prediction is used, the fetch unit can at least fetch multiple contiguous basic blocks. Two components are essential to providing the ability to fetch more than one basic block each cycle