This paper proposes a new way of efficiently doing arbitrary $n$-bit permutations in programmable processors modeled on the theory of omega and flip networks. The new omflip instruction we introduce can perform any permutation of $n$ subwords in $\log n$ instructions, with the subwords ranging from half-words down to single bits. Each omflip instruction can be done in a single cycle, with very efficient hardware implementation. The omflip instruction enhances a programmable processor's capability for handling multimedia and security applications which use subword permutations extensively.
Xiao Yang, Ruby B. Lee
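
As a rough illustration of the two network stages named in the abstract, the following Python sketch models one omega stage (perfect shuffle followed by a column of 2x2 switches) and one flip stage (its mirror image) acting on a list of n subwords. The function names, the list representation, and the control-bit convention (one bit per switch, 1 meaning "swap") are assumptions made for illustration; they are not the instruction encoding or hardware design from the paper.

```python
def omega_stage(x, ctrl):
    """One omega stage: perfect shuffle, then conditional pairwise exchange.

    x    -- list of n subwords, n even (assumed representation)
    ctrl -- list of n/2 control bits, one per 2x2 switch (assumed convention)
    """
    n = len(x)
    half = n // 2
    out = [None] * n
    # Perfect shuffle: interleave the first and second halves.
    for i in range(half):
        out[2 * i] = x[i]
        out[2 * i + 1] = x[i + half]
    # Exchange: switch j swaps positions 2j and 2j+1 when ctrl[j] is 1.
    for j in range(half):
        if ctrl[j]:
            out[2 * j], out[2 * j + 1] = out[2 * j + 1], out[2 * j]
    return out


def flip_stage(x, ctrl):
    """One flip stage: conditional pairwise exchange, then inverse shuffle."""
    n = len(x)
    half = n // 2
    y = list(x)
    # Exchange first, using the same per-switch convention as omega_stage.
    for j in range(half):
        if ctrl[j]:
            y[2 * j], y[2 * j + 1] = y[2 * j + 1], y[2 * j]
    out = [None] * n
    # Inverse perfect shuffle: de-interleave into first and second halves.
    for i in range(half):
        out[i] = y[2 * i]
        out[i + half] = y[2 * i + 1]
    return out


if __name__ == "__main__":
    data = list(range(8))
    shuffled = omega_stage(data, [1, 0, 0, 0])
    print(shuffled)
    print(flip_stage(shuffled, [1, 0, 0, 0]))
```

In this model a flip stage given the same control bits undoes the corresponding omega stage, which is the mirror-image relationship between the two networks. How the real instruction packs control bits and pairs the two stages is left to the body of the paper.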