We examine some of the frequently disregarded subtleties of tokenization in Penn Treebank style, and present a new rule-based preprocessing toolkit that not only reproduces the Tr...
This paper presents a comparative study of spelling errors that are corrected as you type, vs. those that remain uncorrected. First, we generate naturally occurring online error c...
The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. Howe...
We investigate the problem of ordering medical events in unstructured clinical narratives by learning to rank them based on their time of occurrence. We represent each medical eve...
Preethi Raghavan, Albert M. Lai, Eric Fosler-Lussi...
This work presents a first step to a general implementation of the Semantic-Script Theory of Humor (SSTH). Of the scarce amount of research in computational humor, no research ha...
We investigate how novel English-derived words (anglicisms) are used in a Germanlanguage Internet hip hop forum, and what factors contribute to their uptake.
Some Statistical Machine Translation systems never see the light because the owner of the appropriate training data cannot release them, and the potential user of the system canno...
In social psychology, it is generally accepted that one discloses more of his/her personal information to someone in a strong relationship. We present a computational framework fo...
Extracting sentiment and topic lexicons is important for opinion mining. Previous works have showed that supervised learning methods are superior for this task. However, the perfo...
Fangtao Li, Sinno Jialin Pan, Ou Jin, Qiang Yang, ...
We investigate the consistency of human assessors involved in summarization evaluation to understand its effect on system ranking and automatic evaluation techniques. Using Text A...
Karolina Owczarzak, Peter A. Rankel, Hoa Trang Dan...