Parallel corpora are critical resources for machine translation research and development since parallel corpora contain translation equivalences of various granularities. Manual a...
— We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
Hybrid of Neural Network (NN) and Hidden Markov Model (HMM) has been popular in word recognition, taking advantage of NN discriminative property and HMM representational capabilit...
Abdul Rahim Ahmad, Christian Viard-Gaudin, Marzuki...
We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...
Minimum Error Rate Training (MERT) is an effective means to estimate the feature function weights of a linear model such that an automated evaluation criterion for measuring syste...
Wolfgang Macherey, Franz Josef Och, Ignacio Thayer...