An approach is presented for high throughput matching of regular expressions (regexes) by first converting them into corresponding Non-deterministic Finite Automata (NFAs) which are then configured onto a FPGA. The key novel feature is a technique that, for any given regex, constructs an NFA that processes multiple characters per clock cycle. An efficient algorithm is proposed that outputs an NFA which processes twice the number of characters as the input one. A technique is also proposed that implements the range match operation (e.g. [a-z]) efficiently. A program has been written that implements above ideas to convert regexes into NFAs specified in a structural Hardware Design Language (HDL), which are then mapped onto a FPGA. Performance is evaluated using real world regexes (Snort ruleset). The results demonstrate the practical utility of the approach. For example, for a set of 2,691 regexes, while the standard 1
Norio Yamagaki, Reetinder P. S. Sidhu, Satoshi Kam