As unlexicalized parsing lacks word token information, it is important to investigate novel parsing features to improve the accuracy. This paper studies a set of tree topological ...
Samuel W. K. Chan, Lawrence Y. L. Cheung, Mickey W...
This paper presents a novel method for acquiring a set of query patterns to retrieve documents containing important information about an entity. Given an existing Wikipedia catego...
A weakly supervised method uses anonymized search queries to induce a ranking among class labels extracted from unstructured text for various instances. The accuracy of the extrac...
We investigate the effectiveness of different linguistic cues for distinguishing literal and non-literal usages of potentially idiomatic expressions. We focus specifically on feat...
We propose a methodology for investigating how well NLP systems handle meaning preserving syntactic variations. We start by presenting a method for the semi automated creation of ...
Verb suffixes and verb complexes of morphologically rich languages carry a lot of information. We show that this information if harnessed for the task of shallow parsing can lead ...
Harshada Gune, Mugdha Bapat, Mitesh M. Khapra, Pus...
Information Extraction (IE) technology is facing new challenges of dealing with large-scale heterogeneous data sources from different documents, languages and modalities. Informat...
In this paper we look at the problem of cleansing noisy text using a statistical machine translation model. Noisy text is produced in informal communications such as Short Message...
Danish Contractor, Tanveer A. Faruquie, L. Venkata...
There often exist multiple corpora for the same natural language processing (NLP) tasks. However, such corpora are generally used independently due to distinctions in annotation s...