Semistatic word-based byte-oriented compression codes are known to be attractive alternatives for compressing natural language texts. With compression ratios around 30%, they allow di...
The problem of approximate pattern matching on hypertext is defined and solved by Amir et al. in O(m(n log m + e)) time, where m is the length of the pattern, n is the total text size...
In this paper we investigate a structured model for jointly classifying the sentiment of text at varying levels of granularity. Inference in the model is based on standard sequenc...
Ryan T. McDonald, Kerry Hannan, Tyler Neylon, Mike...
This paper presents a new method for producing a dictionary of subcategorization frames from unlabelled text corpora. It is shown that statistical filtering of the results of a ...
This paper presents a near real-time multilingual news monitoring and analysis system that forms the backbone of our research work. The system integrates technologies to address t...