We propose a novel measure of the representativeness (i.e., indicativeness or topic specificity) of a term in a given corpus. The measure embodies the idea that the distribution o...
Extracting sentences that contain important information from a document is a form of text summarization. The technique is the key to the automatic generation of summaries similar ...
We propose a method, "Interactive Paraphrasing," that enables users to interactively paraphrase words in a document by their definitions, making use of syntactic annotati...
We present a quantitative model of word order and movement constraints that enables a simple and uniform treatment of a seemingly heterogeneous collection of linear order phenomena...
Zipf's law states that the frequency of a word in a large corpus of natural language is inversely proportional to its frequency rank. The law is investigated for two languages Eng...
Le Quan Ha, Elvira I. Sicilia-Garcia, Ji Ming, F. ...
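The rank-frequency relation named in the Zipf's law abstract above can be illustrated with a minimal sketch. It assumes the classical form f(r) = f(1) / r with exponent 1; the function names `rank_frequency` and `zipf_prediction` and the toy token list are hypothetical and are not taken from the paper, whose own corpora and fitting procedure are not shown in this listing.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count; break ties alphabetically so the
    # ranking is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def zipf_prediction(top_count, rank):
    """Idealized Zipf frequency, assuming exponent 1: f(r) = f(1) / r."""
    return top_count / rank


# Toy token list, invented purely for illustration.
tokens = "the cat saw the dog and the dog saw the cat".split()
table = rank_frequency(tokens)

# Compare each observed count with the idealized prediction.
top_count = table[0][2]
for rank, word, count in table:
    print(rank, word, count, round(zipf_prediction(top_count, rank), 2))
```

A sample this small cannot be expected to follow the law closely; the sketch only shows how observed counts would be lined up against the idealized 1/r curve.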
We evaluate probabilistic models of verb argument structure trained on a corpus of verbs and their syntactic arguments. Models designed to represent patterns of verb alternation b...
We profile the occurrence of clausal extraposition in corpora from different domains and demonstrate that extraposition is a pervasive phenomenon in German that must be addressed ...
Michael Gamon, Eric K. Ringger, Zhu Zhang, Robert ...
Speech dialog systems need to deal with various kinds of ill-formed speech inputs that appear in natural human-human dialog. Self-correction (or speech-repair) is a particularly p...