
13 years 7 months ago
Summarizing Contrastive Viewpoints in Opinionated Text
This paper presents a two-stage approach to summarizing multiple contrastive viewpoints in opinionated text. In the first stage, we use an unsupervised probabilistic approach to m...
Michael Paul, ChengXiang Zhai, Roxana Girju
13 years 7 months ago
Negative Training Data Can be Harmful to Text Classification
This paper studies the effects of training data on binary text classification and postulates that negative training data is not needed and may even be harmful for the task. Tradit...
Xiaoli Li, Bing Liu, See-Kiong Ng
13 years 7 months ago
Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing
Inducing a grammar directly from text is one of the oldest and most challenging tasks in Computational Linguistics. Significant progress has been made for inducing dependency gram...
Phil Blunsom, Trevor Cohn
13 years 7 months ago
Simple Type-Level Unsupervised POS Tagging
Part-of-speech (POS) tag distributions are known to exhibit sparsity -- a word is likely to take a single predominant tag in a corpus. Recent research has demonstrated that incorp...
Yoong Keok Lee, Aria Haghighi, Regina Barzilay
13 years 7 months ago
Improving Gender Classification of Blog Authors
The problem of automatically classifying the gender of a blog author has important applications in many commercial domains. Existing systems mainly use features such as words, wor...
Arjun Mukherjee, Bing Liu
13 years 7 months ago
Towards Conversation Entailment: An Empirical Investigation
While a significant amount of research has been devoted to textual entailment, automated entailment from conversational scripts has received less attention. To address this limita...
Chen Zhang, Joyce Yue Chai
13 years 7 months ago
A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension
Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to beforehand create an extensive training corp...
Hugo Hernault, Danushka Bollegala, Mitsuru Ishizuk...
13 years 7 months ago
A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model
We show that the standard beam-search algorithm can be used as an efficient decoder for the global linear model of Zhang and Clark (2008) for joint word segmentation and POS-taggi...
Yue Zhang, Stephen Clark
13 years 7 months ago
Word-Based Dialect Identification with Georeferenced Rules
We present a novel approach for (written) dialect identification based on the discriminative potential of entire words. We generate Swiss German dialect words from a Standard Germ...
Yves Scherrer, Owen Rambow
13 years 7 months ago
Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of ...
Tom Kwiatkowski, Luke S. Zettlemoyer, Sharon Goldw...