Multi-document summarization is a challenge to information overload problem to provide a condensed text for a number of documents. Most multi-document summarization systems make u...
We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, w...
Named entity recognition aims at extracting named entities from unstructured text. A recent trend of named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...
Many of today’s web sites contain substantial amounts of client-side code, and consequently, they act more like programs than simple documents. This creates robustness and perfo...
For privacy reasons, sensitive content may be revised before it is released. The revision often consists of redaction, that is, the “blacking out” of sensitive words and phras...