Abstract—In this paper, we present FlowSifter, a systematic framework for online application protocol field extraction. FlowSifter introduces a new grammar model, Counting Regular Grammars (CRG), and a corresponding automata model, Counting Automata (CA). The CRG and CA models add counters with update functions and transition guards to regular grammars and finite state automata. These additions give CRGs and CAs the ability to parse and extract fields from context-sensitive application protocols, and they also facilitate fast, stackless approximate parsing of recursive structures. These new models enable FlowSifter to generate optimized Layer 7 field extractors from simple extraction specifications. In our experiments, we compare FlowSifter against both BinPAC and UltraPAC, which are the freely available state-of-the-art field extractors. Our experiments show that, compared to UltraPAC parsers, FlowSifter extractors run 84% faster and use 12% of the memory....
Chad R. Meiners, Eric Norige, Alex X. Liu, Eric To
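To illustrate the idea of adding counters, update functions, and transition guards to a finite state automaton, the following is a minimal Python sketch. It is not FlowSifter's implementation or its specification language. The class names (`Transition`, `CountingAutomaton`), the `emit` and `end_field` flags, and the toy length-prefixed record format `<decimal length>:<body>` are all assumptions made for this example. The body length depends on a value read earlier in the stream, which is the kind of context sensitivity a single counter can track without a stack.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Transition:
    """One guarded transition of a (hypothetical) counting automaton."""
    matches: Callable[[str], bool]      # predicate on the input symbol
    guard: Callable[[int], bool]        # predicate on the current counter value
    update: Callable[[int, str], int]   # computes the new counter value
    target: str                         # next state
    emit: bool = False                  # symbol belongs to the extracted field
    end_field: bool = False             # field is complete after this symbol


class CountingAutomaton:
    """Finite automaton extended with a single integer counter."""

    def __init__(self, start: str, transitions: Dict[str, List[Transition]]):
        self.start = start
        self.transitions = transitions

    def extract(self, data: str) -> List[str]:
        state, counter = self.start, 0
        fields: List[str] = []
        current: List[str] = []
        for ch in data:
            # Take the first transition whose symbol test and guard both hold.
            for t in self.transitions[state]:
                if t.matches(ch) and t.guard(counter):
                    counter = t.update(counter, ch)
                    if t.emit:
                        current.append(ch)
                    if t.end_field:
                        fields.append("".join(current))
                        current = []
                    state = t.target
                    break
            else:
                raise ValueError(f"no transition from {state!r} on {ch!r}")
        return fields


def any_symbol(ch: str) -> bool:
    return True


def always(counter: int) -> bool:
    return True


# Toy format (an assumption for this sketch): records of "<length>:<body>".
LENGTH_PREFIXED = CountingAutomaton(
    start="LEN",
    transitions={
        "LEN": [
            # Accumulate the decimal length into the counter.
            Transition(str.isdigit, always, lambda c, ch: c * 10 + int(ch), "LEN"),
            # Non-empty body follows the separator.
            Transition(lambda ch: ch == ":", lambda c: c > 0, lambda c, ch: c, "BODY"),
            # Zero-length body: emit an empty field and read the next length.
            Transition(lambda ch: ch == ":", lambda c: c == 0, lambda c, ch: c,
                       "LEN", end_field=True),
        ],
        "BODY": [
            # More body bytes remain: consume one and decrement the counter.
            Transition(any_symbol, lambda c: c > 1, lambda c, ch: c - 1,
                       "BODY", emit=True),
            # Last body byte: finish the field and return to reading a length.
            Transition(any_symbol, lambda c: c == 1, lambda c, ch: 0,
                       "LEN", emit=True, end_field=True),
        ],
    },
)
```

Under these assumptions, `LENGTH_PREFIXED.extract("5:hello3:a:b")` yields the two bodies `hello` and `a:b`. The `:` inside the second body is treated as data rather than as a separator because the guard on the counter, not the symbol alone, decides which transition fires.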