We propose using large-scale clustering of dependency relations between verbs and multiword nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependen...
In this work, the problem of extracting phrase translation is formulated as an information retrieval process implemented with a log-linear model aiming for a balanced precision an...
This paper provides evidence that the use of more unlabeled data in semi-supervised learning can improve the performance of Natural Language Processing (NLP) tasks, such as part-o...
We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as con...
Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatric...
We assess the current state of the art in speech summarization, by comparing a typical summarizer on two different domains: lecture data and the SWITCHBOARD corpus. Our results ca...
Phrase-based decoding produces state-of-theart translations with no regard for syntax. We add syntax to this process with a cohesion constraint based on a dependency tree for the ...
Statistical parsing of noun phrase (NP) structure has been hampered by a lack of goldstandard data. This is a significant problem for CCGbank, where binary branching NP derivation...
We define a new formalism, based on Sikkel's parsing schemata for constituency parsers, that can be used to describe, analyze and compare dependency parsing algorithms. This ...
The traditional mention-pair model for coreference resolution cannot capture information beyond mention pairs for both learning and testing. To deal with this problem, we present ...
Xiaofeng Yang, Jian Su, Jun Lang, Chew Lim Tan, Ti...
For Chinese POS tagging, word segmentation is a preliminary step. To avoid error propagation and improve segmentation by utilizing POS information, segmentation and tagging can be...