
CICLING
2010
Springer


A Chunk-Driven Bootstrapping Approach to Extracting Translation Patterns


Download www.clips.ua.ac.be

Abstract. We present a linguistically motivated sub-sentential alignment system that extends the intersected IBM Model 4 word alignments. The alignment system is chunk-driven and requires only shallow linguistic processing tools for the source and target languages, i.e., part-of-speech taggers and chunkers. We conceive of the sub-sentential aligner as a cascaded model consisting of two phases. In the first phase, anchor chunks are linked on the basis of the intersected word alignments and syntactic similarity. In the second phase, we use a bootstrapping approach to extract more complex translation patterns. The results show an overall reduction in alignment error rate (AER) and competitive F-measures in comparison with the commonly used symmetrized IBM Model 4 predictions (intersection, union, and grow-diag-final) on six different text types for English-Dutch. In particular, compared with the intersected word alignments, the proposed method improves recall without sacrificing precision. Moreover, the system is ...
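
The abstract compares against intersection and union symmetrizations and reports precision, recall, and AER. As a rough illustration of those baseline notions only (not the authors' chunk-driven system), the following minimal Python sketch assumes word alignments are represented as sets of (source index, target index) pairs and uses the standard AER definition with sure and possible gold links. All function names and the toy data are hypothetical.

```python
from typing import Set, Tuple

# A link pairs a source-word index with a target-word index (assumed representation).
Link = Tuple[int, int]


def intersect(src2tgt: Set[Link], tgt2src: Set[Link]) -> Set[Link]:
    """High-precision symmetrization: keep links found in both directions."""
    return src2tgt & tgt2src


def union(src2tgt: Set[Link], tgt2src: Set[Link]) -> Set[Link]:
    """High-recall symmetrization: keep links found in either direction."""
    return src2tgt | tgt2src


def precision(predicted: Set[Link], possible: Set[Link]) -> float:
    """Fraction of predicted links that are at least possible in the gold standard."""
    return len(predicted & possible) / len(predicted) if predicted else 0.0


def recall(predicted: Set[Link], sure: Set[Link]) -> float:
    """Fraction of sure gold links that were predicted."""
    return len(predicted & sure) / len(sure) if sure else 0.0


def aer(predicted: Set[Link], sure: Set[Link], possible: Set[Link]) -> float:
    """Alignment error rate: 1 - (|A & S| + |A & P|) / (|A| + |S|)."""
    possible = possible | sure  # sure links are possible by definition
    denom = len(predicted) + len(sure)
    if denom == 0:
        return 0.0
    return 1.0 - (len(predicted & sure) + len(predicted & possible)) / denom


# Toy data (invented for illustration): two directional alignments and a gold standard.
src2tgt = {(0, 0), (1, 1), (2, 1)}
tgt2src = {(0, 0), (1, 1), (2, 2)}
gold_sure = {(0, 0), (1, 1), (2, 2)}

for name, links in [("intersection", intersect(src2tgt, tgt2src)),
                    ("union", union(src2tgt, tgt2src))]:
    print(name,
          "P=%.3f" % precision(links, gold_sure),
          "R=%.3f" % recall(links, gold_sure),
          "AER=%.3f" % aer(links, gold_sure, gold_sure))
```

On this toy input the intersection keeps only the links both directions agree on, so it trades recall for precision, while the union does the opposite; this is the trade-off against which the abstract positions the proposed method.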

Lieve Macken, Walter Daelemans


CICLING 2010 | IBM Model | Intersected Word Alignments | Natural Language Processing | Word Alignments


Related Content

» A Bootstrapping Method for Extracting Bilingual Text Pairs

» A bootstrapping approach for identifying stakeholders in public-comment corpora

» Semi-supervised Semantic Pattern Discovery with Guidance from Unsupervised Pattern Clusters

» A Bootstrapping Approach for Geographic Named Entity Annotation

» Multiview Bootstrapping for Relation Extraction by Exploring Web Features and Linguistic F...

» Transliterated Named Entity Recognition Based on Chinese Word Sketch

» StatSnowball: a statistical approach to extracting entity relationships

» A Seed-driven Bottom-up Machine Learning Framework for Extracting Relations of Various Compl...

» Corroborate and learn facts from the web

Post Info

Added	13 May 2011
Updated	13 May 2011
Type	Journal
Year	2010
Where	CICLING
Authors	Lieve Macken, Walter Daelemans
