On public communication networks such as the Internet, data confidentiality can be provided by symmetric-key ciphers. Table lookups are among the most common operations in symmetric-key ciphers, and they frequently account for the largest fraction of the execution time when the ciphers are implemented using a typical RISC-like instruction set. To accelerate these table lookups, we describe a new hardware module, called PTLU (for Parallel Table Lookup), which consists of multiple lookup tables that can be accessed in parallel. A novel combinational circuit included in the module can optionally perform simple logic operations on the data read from the tables. On a single-issue 64-bit RISC processor, PTLU provides maximum speedups of 7.7× for AES and 5.4× for DES. With wordsize scaling, PTLU speedups are significantly higher than those available through more conventional architectural techniques such as superscalar or VLIW execution.
A. Murat Fiskiran, Ruby B. Lee
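
To make the idea of parallel table lookup concrete, the following is a minimal software sketch, not the PTLU hardware or its instruction set. It assumes eight tables of 256 entries each, indexed by the eight bytes of a 64-bit word, with XOR as the simple logic operation applied to the values read. The table count, table size, table contents, and all function names are illustrative assumptions that do not come from the abstract.

```python
import functools
import operator

# Hypothetical parameters, chosen only for illustration.
NUM_TABLES = 8        # assumption: one table per byte of a 64-bit word
TABLE_ENTRIES = 256   # assumption: each table is indexed by 8 bits
MASK64 = (1 << 64) - 1


def make_tables():
    """Build placeholder tables with arbitrary contents (not real cipher tables)."""
    return [
        [((i * 0x9E3779B97F4A7C15) ^ (t << 56) ^ i) & MASK64
         for i in range(TABLE_ENTRIES)]
        for t in range(NUM_TABLES)
    ]


def serial_lookup(tables, word):
    """Baseline: extract one byte, read one table, and XOR, one step at a time."""
    acc = 0
    for t in range(NUM_TABLES):
        index = (word >> (8 * t)) & 0xFF
        acc ^= tables[t][index]
    return acc


def parallel_lookup_xor(tables, word):
    """Model of a single combined step: read all tables, then XOR the results."""
    indices = [(word >> (8 * t)) & 0xFF for t in range(NUM_TABLES)]
    values = [tables[t][i] for t, i in enumerate(indices)]
    return functools.reduce(operator.xor, values, 0)
```

Both functions compute the same value. The difference the sketch is meant to highlight is that the serial version needs a separate extract, load, and XOR for each table, whereas a module with tables readable in parallel could, under these assumptions, produce the combined result in one operation.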