We propose an algorithm for modular multiplication/division suitable for VLSI implementation. The algorithm is based on Montgomery’s method for modular multiplication and on the extended Binary GCD algorithm for modular division. It can perform either of these operations with a reduced amount of hardware. Both calculations are carried out through iterations of simple operations such as shifts and additions/subtractions. The radix-2 signed-digit representation is employed so that all additions and subtractions are performed without carry propagation. A modular multiplier/divider based on this algorithm has a linear array structure with a bit-slice feature and carries out an n-bit modular multiplication in at most 2(n+2) 3 + 3 clock cycles and an n-bit modular division in at most 2n+5 clock cycles, where the length of the clock cycle is constant and independent of n.
Marcelo E. Kaihara, Naofumi Takagi