This paper presents an algorithm for exact pattern matching based on a new type of Bloom filter that we call a feed-forward Bloom filter. Besides filtering the input corpus, a feed-forward Bloom filter can also reduce the set of patterns needed for the exact matching phase. We show that this technique, along with a CPU-architecture-aware design of the Bloom filter, can provide speedups between 2× and 30× and memory consumption reductions as large as 50× when compared with grep, while the filtering speed can be as much as 5× higher than that of a normal Bloom filter.

This research was supported by grants from the National Science Foundation, Google, Network Appliance, Intel Corporation and Carnegie Mellon CyLab.
Iulian Moraru, David G. Andersen
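
The abstract does not spell out how the filter reduces the pattern set, so the following is only a minimal sketch of one way the idea could work, not the authors' implementation. It assumes a second bit vector that records which filter bits the corpus actually touched; patterns whose bits were not all touched cannot occur in the corpus and are dropped before exact matching. All names (`feed_forward_filter`, `exact_match`, `_positions`), the window length `w`, the sizes `m` and `k`, and the SHA-256 double hashing are hypothetical choices made for illustration.

```python
import hashlib


def _positions(item: bytes, m: int, k: int) -> list[int]:
    """Return k bit positions for item (double hashing; an illustrative choice)."""
    digest = hashlib.sha256(item).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step
    return [(h1 + i * h2) % m for i in range(k)]


def feed_forward_filter(patterns, corpus_lines, w, m=1 << 16, k=4):
    """Filter both the corpus and the pattern set (assumed mechanism).

    Every pattern must be at least w bytes long; only its first w bytes
    are inserted into the filter.
    """
    # Build the ordinary Bloom filter from the pattern prefixes.
    # One byte per bit keeps the sketch simple, not compact.
    primary = bytearray(m)
    for p in patterns:
        for pos in _positions(p[:w], m, k):
            primary[pos] = 1

    # Scan the corpus; on every filter hit, copy the hit bits forward
    # into a second vector.
    feedback = bytearray(m)
    kept_lines = []
    for line in corpus_lines:
        hit = False
        for i in range(len(line) - w + 1):
            pos = _positions(line[i:i + w], m, k)
            if all(primary[j] for j in pos):
                hit = True
                for j in pos:
                    feedback[j] = 1
        if hit:
            kept_lines.append(line)

    # Keep only patterns whose bits were all set by the corpus scan.
    kept_patterns = [
        p for p in patterns
        if all(feedback[j] for j in _positions(p[:w], m, k))
    ]
    return kept_lines, kept_patterns


def exact_match(patterns, lines):
    """Exact matching phase: lines containing at least one pattern."""
    return [line for line in lines if any(p in line for p in patterns)]


patterns = [b"needle", b"absent"]
corpus = [b"a needle in hay", b"nothing here"]
kept_lines, kept_patterns = feed_forward_filter(patterns, corpus, w=6)
matches = exact_match(kept_patterns, kept_lines)
```

Under these assumptions, a pattern that occurs in the corpus is always retained, because its own prefix window sets every one of its bits in the second vector. False positives can only leave extra lines or patterns for the exact matching phase to discard.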