We show how to achieve typed and unambiguous declarative pattern matching on strings using regular expressions extended with a simple recording operator. We give a characterization of ambiguity of regular expressions that leads to a sound and complete static analysis. The analysis is capable of pinpointing all ambiguities in terms of the structure of the regular expression and of reporting shortest ambiguous strings. We also show how pattern matching can be integrated into statically typed programming languages for deconstructing strings and reproducing typed and structured values. We validate our approach with a full implementation. The resulting tool, reg-exp-rec, adds typed and unambiguous pattern matching to Java in a stand-alone and non-intrusive manner. We evaluate the approach using several realistic examples.

Keywords: Regular Expressions, Pattern Matching, Ambiguity, Disambiguation, Static Analysis, Type Inference, Parsing
Claus Brabrand, Jakob G. Thomsen