The content-based publish/subscribe (pub/sub) paradigm for system design is becoming increasingly popular, offering unique benefits for a large number of data-intensive applications. Coupled with peer-to-peer technology, it can serve as a central building block for such applications deployed over a large-scale network infrastructure. A key problem in building large-scale content-based pub/sub infrastructures is dealing efficiently with continuous queries (subscriptions) that carry rich predicates on string attributes. In this work we study the problem of efficiently and accurately matching substring queries to incoming events.