
13 years 11 months ago
OntoNotes: Corpus Cleanup of Mistaken Agreement Using Word Sense Disambiguation
Annotated corpora are only useful if their annotations are consistent. Most large-scale annotation efforts take special measures to reconcile inter-annotator disagreement. To date...
Liang-Chih Yu, Chung-Hsien Wu, Eduard H. Hovy
13 years 11 months ago
Modeling Chinese Documents with Topical Word-Character Models
As Chinese text is written without word boundaries, effectively recognizing Chinese words is like recognizing collocations in English, substituting characters for words and words ...
Wei Hu, Nobuyuki Shimizu, Hiroshi Nakagawa, Huanye...
13 years 11 months ago
Understanding and Summarizing Answers in Community-Based Question Answering Services
Community-based question answering (cQA) services have accumulated millions of questions and their answers over time. In the process of accumulation, cQA services assume that ques...
Yuanjie Liu, Shasha Li, Yunbo Cao, Chin-Yew Lin, D...
13 years 11 months ago
A Fluid Knowledge Representation for Understanding and Generating Creative Metaphors
Creative metaphor is a phenomenon that stretches and bends the conventions of semantic description, often to humorous and poetic extremes. The computational modeling of metaphor t...
Tony Veale, Yanfen Hao
13 years 11 months ago
Weakly Supervised Supertagging with Grammar-Informed Initialization
Much previous work has investigated weak supervision with HMMs and tag dictionaries for part-of-speech tagging, but there have been no similar investigations for the harder proble...
Jason Baldridge
13 years 11 months ago
Acquiring Sense Tagged Examples using Relevance Feedback
Supervised approaches to Word Sense Disambiguation (WSD) have been shown to outperform other approaches but are hampered by reliance on labeled training examples (the data acquisi...
Mark Stevenson, Yikun Guo, Robert J. Gaizauskas
13 years 11 months ago
A Framework for Identifying Textual Redundancy
The task of identifying redundant information in documents that are generated from multiple sources provides a significant challenge for summarization and QA systems. Traditional ...
Kapil Thadani, Kathleen McKeown
13 years 11 months ago
Sentence Type Based Reordering Model for Statistical Machine Translation
Many reordering approaches have been proposed for statistical machine translation (SMT) systems. However, the information about the type of source sentence is ignored in the pr...
Jiajun Zhang, Chengqing Zong, Shoushan Li
13 years 11 months ago
An Algorithm for Adverbial Aspect Shift
The paper offers a new type of approach to the semantic phenomenon of adverbial aspect shift within the framework of finite-state temporal semantics. The heart of the proposal is a...
Sabine Gründer