Shallow Parsing using Noisy and Non-Stationary Training Material

Shallow parsers are usually assumed to be trained on noise-free material drawn from the same distribution as the testing material. However, when the training set is noisy or drawn from a different distribution, performance may be degraded. Using the parsed Wall Street Journal, we investigate the performance of four shallow parsers (maximum entropy, memory-based learning, N-grams and ensemble learning) trained using various types of artificially noisy material. Our first set of results shows that shallow parsers are surprisingly robust to synthetic noise, with performance gradually decreasing as the rate of noise increases. Further results show that no single shallow parser performs best in all noise situations. Final results show that simple, parser-specific extensions can improve noise-tolerance. Our second set of results addresses the question of whether naturally occurring disfluencies undermine performance more than a change in distribution does. Results using the pa...

Miles Osborne

JMLR 2002 | Parsed Wall Street | Performance | Shallow Parser |

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	2002
Where	JMLR
Authors	Miles Osborne
