Abstract—Regular expression matching (REM) with nondeterministic finite automata (NFA) can be computationally expensive when a large number of patterns are matched concurrently. On the other hand, converting the NFA to a deterministic finite automaton (DFA) can cause state explosion, where the numbers of states and transitions in the DFA can be exponentially larger than in the NFA. In this paper, we seek to answer the following question: to match an arbitrary set of regular expressions, is there a finite automaton that lies between the NFA and the DFA in terms of computation and memory complexity? We introduce the semideterministic finite automaton (SFA) and the state convolvement test, which is used to construct an SFA from a given NFA. An SFA consists of a fixed number (p) of constituent DFAs (c-DFAs) running in parallel; each c-DFA is responsible for a subset of the states in the original NFA. To match a set of regular expressions with n overlapping symbols (that can match the same input character c...
Yi-Hua E. Yang, Viktor K. Prasanna
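
The state explosion mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' SFA construction or their state convolvement test. It applies the textbook subset construction to a standard example pattern, (a|b)*a(a|b){k}, chosen here as an assumption for illustration. The helper names `build_nfa`, `subset_construction`, and `count_states` are hypothetical.

```python
from collections import deque


def build_nfa(k):
    """Illustrative NFA for (a|b)*a(a|b){k} over the alphabet {a, b}.

    States are 0..k+1, with state 0 as the start state and state k+1 as
    the only accepting state. Returns the transition map
    (state, symbol) -> set of next states, plus the NFA state count.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k + 1):
        for sym in "ab":
            delta[(i, sym)] = {i + 1}
    return delta, k + 2


def subset_construction(delta, start=0, alphabet="ab"):
    """Return every DFA state (a frozenset of NFA states) reachable from start."""
    start_set = frozenset({start})
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            # The DFA successor is the union of all NFA successors.
            nxt = frozenset(t for s in current for t in delta.get((s, sym), ()))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def count_states(k):
    """Return (number of NFA states, number of reachable DFA states)."""
    delta, n_nfa = build_nfa(k)
    return n_nfa, len(subset_construction(delta))


if __name__ == "__main__":
    for k in range(1, 6):
        n_nfa, n_dfa = count_states(k)
        print(f"k={k}: NFA states={n_nfa}, DFA states={n_dfa}")
```

For this particular pattern the NFA grows linearly with k, while the DFA must remember which of the last k+1 input characters were `a`, so its state count doubles with each increment of k. This is the gap between the two automata that the SFA is intended to sit between.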