Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation

Building NLG systems, in particular statistical ones, requires parallel data (paired inputs and outputs) which do not generally occur naturally. In this paper, we investigate the idea of automatically extracting parallel resources for data-to-text generation from comparable corpora obtained from the Web. We describe our comparable corpus of data and texts relating to British hills and the techniques for extracting paired input/output fragments we have developed so far.

Anja Belz, Eric Kow

Comparable Corpora | INLG 2010 | Natural Language Processing | NLG Systems | Particular Statistical Ones

Post Info

Added	13 Feb 2011
Updated	13 Feb 2011
Type	Journal
Year	2010
Where	INLG
Authors	Anja Belz, Eric Kow
