
13 years 9 months ago
Soft-Supervised Learning for Text Classification
We propose a new graph-based semi-supervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a ...
Amarnag Subramanya, Jeff Bilmes
13 years 9 months ago
Regular Expression Learning for Information Extraction
Regular expressions have served as the dominant workhorse of practical information extraction for several years. However, there has been little work on reducing the manual effort ...
Yunyao Li, Rajasekar Krishnamurthy, Sriram Raghava...
13 years 9 months ago
Joint Unsupervised Coreference Resolution with Markov Logic
Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi...
Hoifung Poon, Pedro Domingos
13 years 9 months ago
Graph-based Analysis of Semantic Drift in Espresso-like Bootstrapping Algorithms
Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate the semantic drift of bootstrapp...
Mamoru Komachi, Taku Kudo, Masashi Shimbo, Yuji Ma...
13 years 9 months ago
One-Class Clustering in the Text Domain
Having seen a news title "Alba denies wedding reports", how do we infer that it is primarily about Jessica Alba, rather than about weddings or reports? We probably reali...
Ron Bekkerman, Koby Crammer
13 years 9 months ago
Dependency-based Semantic Role Labeling of PropBank
We present a PropBank semantic role labeling system for English that is integrated with a dependency parser. To tackle the problem of joint syntactic...
Richard Johansson, Pierre Nugues
13 years 9 months ago
Adding Redundant Features for CRFs-based Sentence Sentiment Classification
In this paper, we present a novel method based on CRFs in response to the two special characteristics of "contextual dependency" and "label redundancy" in sent...
Jun Zhao, Kang Liu, Gen Wang
13 years 9 months ago
A Comparison of Bayesian Estimators for Unsupervised Hidden Markov Model POS Taggers
There is growing interest in applying Bayesian techniques to NLP problems. There are a number of different estimators for Bayesian models, and it is useful to know what kinds of t...
Jianfeng Gao, Mark Johnson
13 years 9 months ago
Syntactic Constraints on Paraphrases Extracted from Parallel Corpora
We improve the quality of paraphrases extracted from parallel corpora by requiring that phrases and their paraphrases be the same syntactic type. This is achieved by parsing the E...
Chris Callison-Burch