Sciweavers

NAACL
2010
13 years 5 months ago
From Baby Steps to Leapfrog: How "Less is More" in Unsupervised Dependency Parsing
We present three approaches for unsupervised grammar induction that are sensitive to data complexity and apply them to Klein and Manning's Dependency Model with Valence. The ...
Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jura...
NAACL
2010
13 years 5 months ago
Language Identification: The Long and the Short of the Matter
Language identification is the task of identifying the language a given document is written in. This paper describes a detailed examination of what models perform best under diffe...
Timothy Baldwin, Marco Lui
NAACL
2010
13 years 5 months ago
An MDL-based approach to extracting subword units for grapheme-to-phoneme conversion
We address a key problem in grapheme-to-phoneme conversion: the ambiguity in mapping grapheme units to phonemes. Rather than using single letters and phonemes as units, we propose ...
Sravana Reddy, John A. Goldsmith
NAACL
2010
13 years 5 months ago
Appropriately Handled Prosodic Breaks Help PCFG Parsing
This paper investigates using prosodic information in the form of ToBI break indexes for parsing spontaneous speech. We revisit two previously studied approaches, one that hurt pa...
Zhongqiang Huang, Mary P. Harper
NAACL
2010
13 years 5 months ago
Crowdsourcing the evaluation of a domain-adapted named entity recognition system
Named entity recognition systems sometimes have difficulty when applied to data from domains that do not closely match the training data. We first use a simple rule-based techniqu...
Asad B. Sayeed, Timothy J. Meyer, Hieu C. Nguyen, ...
NAACL
2010
13 years 5 months ago
Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions
We describe tree edit models for representing sequences of tree transformations involving complex reordering phenomena and demonstrate that they offer a simple, intuitive, and eff...
Michael Heilman, Noah A. Smith
NAACL
2010
13 years 5 months ago
An Unsupervised Aspect-Sentiment Model for Online Reviews
With the increase in popularity of online review sites comes a corresponding need for tools capable of extracting the information most important to the user from the plain text da...
Samuel Brody, Noemie Elhadad
NAACL
2010
13 years 5 months ago
Information Content Measures of Semantic Similarity Perform Better Without Sense-Tagged Text
This paper presents an empirical comparison of similarity measures for pairs of concepts based on Information Content. It shows that using modest amounts of untagged text to deriv...
Ted Pedersen
NAACL
2010
13 years 5 months ago
Language identification of names with SVMs
The task of identifying the language of text or utterances has a number of applications in natural language processing. Language identification has traditionally been approached w...
Aditya Bhargava, Grzegorz Kondrak
NAACL
2010
13 years 5 months ago
Efficient Parsing of Well-Nested Linear Context-Free Rewriting Systems
The use of well-nested linear context-free rewriting systems has been empirically motivated for modeling of the syntax of languages with discontinuous constituents or relatively f...
Carlos Gómez-Rodríguez, Marco Kuhlma...