

Kleenex: compiling nondeterministic transducers to deterministic streaming transducers

We present and illustrate Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with worst-case linear-time performance and sustained high throughput. Its underlying theory is based on transducer decomposition into oracle and action machines: the oracle machine performs streaming greedy disambiguation of the input; the action machine performs the output actions. In use cases Kleenex achieves consistently high throughput rates around the 1 Gbps range on stock hardware. It performs well, especially in complex use cases, in comparison to both specialized and related tools such as AWK, sed, RE2, Ragel and regular-expression libraries.
Categories and Subject Descriptors D.3.1 [Formal Definitions and Theory]: Semantics; D.3.2 [Language Classifications]: Spe
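The oracle/action split described in the abstract can be illustrated with a toy sketch. This is not the Kleenex compiler or its algorithm: the oracle here uses plain backtracking rather than the linear-time streaming construction, and the transducer encoding, the names `oracle`, `action`, and `EXAMPLE`, and the example machine are all assumptions made for illustration. The oracle only decides which transition to take at each step, preferring earlier (higher-priority) alternatives; the action machine then replays those choices and emits the output.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical encoding: state -> ordered list of (input symbol, output, next state).
# Earlier entries have higher priority, which models greedy disambiguation.
Transducer = Dict[str, List[Tuple[str, str, str]]]


def oracle(t: Transducer, start: str, accept: Set[str], text: str) -> Optional[List[int]]:
    """Return the choice indices of the highest-priority accepting path, or None.

    Uses backtracking for clarity; it is neither streaming nor linear-time.
    """
    def search(state: str, pos: int) -> Optional[List[int]]:
        if pos == len(text):
            return [] if state in accept else None
        for idx, (sym, _out, nxt) in enumerate(t.get(state, [])):
            if sym == text[pos]:
                rest = search(nxt, pos + 1)
                if rest is not None:
                    return [idx] + rest
        return None

    return search(start, 0)


def action(t: Transducer, start: str, choices: List[int]) -> str:
    """Replay the oracle's choices and concatenate the output of each step."""
    state, out = start, []
    for idx in choices:
        _sym, emitted, nxt = t[state][idx]
        out.append(emitted)
        state = nxt
    return "".join(out)


# Ambiguous toy machine: on 'a', either emit "A" and stay in q0 (preferred),
# or emit nothing and move to q1, which must then read 'b'.
EXAMPLE: Transducer = {
    "q0": [("a", "A", "q0"), ("a", "", "q1")],
    "q1": [("b", "<ab>", "q0")],
}

choices = oracle(EXAMPLE, "q0", {"q0"}, "aab")
result = action(EXAMPLE, "q0", choices) if choices is not None else None
```

On the input `aab`, the preferred choice is taken for the first `a`, but the second `a` has to fall back to the lower-priority transition so that the trailing `b` can be consumed. Keeping the choice sequence separate from the output step is the point of the sketch: the oracle never touches output, and the action machine never has to resolve ambiguity.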
Niels Bjørn Bugge Grathwohl, Fritz Henglein
Added 09 Apr 2016
Updated 09 Apr 2016
Type Journal
Year 2016
Where POPL
Authors Niels Bjørn Bugge Grathwohl, Fritz Henglein, Ulrik Terp Rasmussen, Kristoffer Aalund Søholm, Sebastian Paaske Tørholm