Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation



We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and provides well defined trade-offs between the ability to identify algorithm outputs and the quality of the watermarked output. Unlike previous work in the field, our approach does not rely on controlling the inputs to the algorithm and provides probabilistic guarantees on the ability to identify collections of results from one’s own algorithm. We present an application in statistical machine translation, where machine translated output is watermarked at minimal loss in translation quality and detected with high recall.

1 Motivation

Machine learning algorithms provide structured results to input queries by simulating human behavior. Examples include automatic machine translation (Brown et al., 1993) or automatic text and rich media summarization (Goldstein et al., 1999). These algorithms often estimate so...



Automatic Machine Translation | EMNLP 2011 | Input Queries | Natural Language Processing | Statistical Machine Translation


Added	20 Dec 2011
Updated	20 Dec 2011
Type	Conference
Year	2011
Where	EMNLP
Authors	Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Josef Och, Juri Ganitkevitch
