Sciweavers

EMNLP
2009
13 years 6 months ago
Mining Search Engine Clickthrough Log for Matching N-gram Features
User clicks on a URL in response to a query are extremely useful predictors of the URL's relevance to that query. Exact match click features tend to suffer from severe data s...
Huihsin Tseng, Longbin Chen, Fan Li, Ziming Zhuang...
EMNLP
2009
Hypernym Discovery Based on Distributional Similarity and Hierarchical Structures
This paper presents a new method of developing a large-scale hyponymy relation database by combining Wikipedia and other Web documents. We attach new words to the hyponymy databas...
Ichiro Yamada, Kentaro Torisawa, Jun'ichi Kazama, ...
EMNLP
2009
Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora
A significant portion of the world's text is tagged by readers on social bookmarking websites. Credit attribution is an inherent problem in these corpora because most pages h...
Daniel Ramage, David Hall, Ramesh Nallapati, Chris...
EMNLP
2009
Less is More: Significance-Based N-gram Selection for Smaller, Better Language Models
The recent availability of large corpora for training N-gram language models has shown the utility of models of higher order than just trigrams. In this paper, we investigate meth...
Robert C. Moore, Chris Quirk
EMNLP
2009
Review Sentiment Scoring via a Parse-and-Paraphrase Paradigm
This paper presents a parse-and-paraphrase paradigm to assess the degrees of sentiment for product reviews. Sentiment identification has been well studied; however, most previous ...
Jingjing Liu, Stephanie Seneff
EMNLP
2009
A Syntactified Direct Translation Model with Linear-time Decoding
Recent syntactic extensions of statistical translation models work with a synchronous context-free or tree-substitution grammar extracted from an automatically parsed parallel cor...
Hany Hassan, Khalil Sima'an, Andy Way
EMNLP
2009
Semi-supervised Speech Act Recognition in Emails and Forums
In this paper, we present a semi-supervised method for automatic speech act recognition in email and forums. The major challenge of this task is due to lack of labeled data in the...
Minwoo Jeong, Chin-Yew Lin, Gary Geunbae Lee
EMNLP
2009
Segmenting Email Message Text into Zones
In the early days of email, widely-used conventions for indicating quoted reply content and email signatures made it easy to segment email messages into their functional parts. To...
Andrew Lampert, Robert Dale, Cécile Paris
EMNLP
2009
Web-Scale Distributional Similarity and Entity Set Expansion
Computing the pairwise semantic similarity between all words on the Web is a computationally challenging task. Parallelization and optimizations are necessary. We propose a highly...
Patrick Pantel, Eric Crestan, Arkady Borkovsky, An...