Many techniques for increasing the amount of instruction-level parallelism (ILP) put increased pressure on the registers inside a CPU. These techniques allow for more operations to occur simultaneously at the cost of requiring more registers to hold the operands and results of those operations, and importantly, more ports on the register banks to allow for concurrent access to the data. One approach of ameliorating the number of ports on a register bank (the cost of ports in gates varies as Æ ¾ where Æ is the number of ports, and adding ports increases access time) is to have multiple register banks with fewer ports, each attached to a subset of the available functional units. This reduces the number of ports needed on a per-bank basis, but can slow operations if a necessary value is not in an attached register bank as copy operations must be inserted. Therefore, there is a circular dependence between assigning operations to functional units and assigning values to register banks. ...
Jason Hiser, Steve Carr, Philip H. Sweany, Steven