To give the encoder maximum flexibility in trading off coding performance against complexity, video coding standards such as H.264/AVC [1], H.263 [2] and MPEG-4 [3] permit any number of B pictures and any arrangement of P pictures within a GOP of arbitrary length. In addition, some video coding systems, such as H.264/AVC, permit multiple reference picture prediction, which improves coding efficiency by allowing the encoder to select reference pictures from a large number of coded pictures. In both cases the temporal distance between the forward and backward reference pictures is not fixed, so deriving the motion vectors of direct mode requires a division operation. Direct mode efficiently exploits the temporal correlation among pictures and requires no bits for coding the motion vectors. However, division is an expensive and undesirable operation in video decoder hardware design. Although H.264/AVC video standard ha...
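
To make the division in direct-mode motion vector derivation concrete, the following is a minimal sketch of the temporal scaling commonly described for B pictures in H.263 and MPEG-4 style direct mode, with the delta vector taken as zero. The names `div_trunc`, `direct_mode_mvs`, `mv_col`, `tr_b` and `tr_d` are hypothetical, and rounding toward zero is an assumption made for illustration. It is not the normative derivation of any of the cited standards.

```python
def div_trunc(a: int, b: int) -> int:
    """Integer division that rounds toward zero (assumed rounding convention)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def direct_mode_mvs(mv_col: int, tr_b: int, tr_d: int) -> tuple[int, int]:
    """Derive forward and backward motion vector components for direct mode.

    mv_col: motion vector component of the co-located block in the
            backward reference picture (hypothetical name).
    tr_b:   temporal distance from the forward reference to the current B picture.
    tr_d:   temporal distance between the forward and backward reference pictures.
    """
    if tr_d == 0:
        raise ValueError("tr_d must be non-zero")
    # Both derivations divide by tr_d, which varies when the GOP structure
    # or the choice of reference pictures is not fixed.
    mv_fw = div_trunc(tr_b * mv_col, tr_d)
    mv_bw = div_trunc((tr_b - tr_d) * mv_col, tr_d)
    return mv_fw, mv_bw


if __name__ == "__main__":
    # One B picture located one third of the way between its references.
    print(direct_mode_mvs(mv_col=12, tr_b=1, tr_d=3))
```

Because `tr_d` is a run-time value here, the divisor cannot be folded into a constant multiply or shift, which is the cost the paragraph above refers to. With a fixed `tr_d`, the same scaling could be precomputed.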