
ML
2007
ACM

Interactive learning of node selecting tree transducer

We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. We propose to represent monadic queries by bottom-up deterministic Node Selecting Tree Transducers (NSTTs), a particular class of tree automata that we introduce. We prove that deterministic NSTTs capture the class of queries definable in monadic second-order logic (MSO) in trees, which Gottlob and Koch (2002) argue to have the right expressiveness for Web information extraction, and prove that monadic queries defined by NSTTs can be answered efficiently. We present a new polynomial-time, RPNI-style algorithm that learns monadic queries defined by deterministic NSTTs from completely annotated examples, where all selected nodes are distinguished. In practice, users prefer to provide partial annotations. We propose to account for partial annotations by intelligent tree pruning heuristics. We introduce prun...
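To make the idea of representing a monadic query by a tree automaton over annotated trees more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the construction from the paper: it assumes a deterministic bottom-up automaton that reads each node's (label, selected?) pair and then folds in the states of its children from left to right, and it assumes the accepted language is functional, meaning each tree has at most one accepted annotation. All names (`run_states`, `select`, `init_b`, `delta_and`) are hypothetical. The toy query selects every node labeled "b".

```python
# Trees are (label, [children]); a node is addressed by its path of child indices.

def run_states(tree, init, delta):
    """Bottom-up pass: for every node, record each state reachable under SOME
    Boolean annotation of its subtree, with one witness (own flag, child states)."""
    label, children = tree
    child_infos = [run_states(c, init, delta) for c in children]
    table = {}
    for flag in (False, True):
        # partial maps "state after reading the first k children" -> chosen child states
        partial = {init(label, flag): []}
        for info in child_infos:
            nxt = {}
            for q, chosen in partial.items():
                for cq in info["states"]:
                    nxt.setdefault(delta(q, cq), chosen + [cq])
            partial = nxt
        for q, chosen in partial.items():
            table.setdefault(q, (flag, chosen))
    return {"states": table, "children": child_infos}

def select(tree, init, delta, final):
    """Top-down pass: follow witnesses from a final root state and collect the
    paths of nodes annotated True. Returns [] if no annotation is accepted."""
    info = run_states(tree, init, delta)
    roots = [q for q in info["states"] if q in final]
    if not roots:
        return []
    selected = []

    def descend(node_info, q, path):
        flag, chosen = node_info["states"][q]
        if flag:
            selected.append(path)
        for i, (child_info, cq) in enumerate(zip(node_info["children"], chosen)):
            descend(child_info, cq, path + (i,))

    descend(info, roots[0], ())
    return selected

# Toy automaton (an assumption for illustration): accept exactly the annotated
# trees in which a node is marked selected if and only if its label is "b".
def init_b(label, flag):
    return "ok" if flag == (label == "b") else "bad"

def delta_and(q, child_q):
    return "ok" if q == "ok" and child_q == "ok" else "bad"

tree = ("a", [("b", []), ("a", [("b", [])])])
print(select(tree, init_b, delta_and, {"ok"}))
```

Because the accepted annotation is assumed to be unique, any chain of witnesses leading to a final root state recovers it, so the query is answered in two passes over the tree instead of by enumerating all annotations.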
Added 27 Dec 2010
Updated 27 Dec 2010
Type Journal
Year 2007
Where ML
Authors Julien Carme, Rémi Gilleron, Aurélien Lemay, Joachim Niehren