This paper investigates methods to automatically infer structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem ...
Polarity shifting marked by various linguistic structures has been a challenge to automatic sentiment classification. In this paper, we propose a machine learning approach to inco...
Shoushan Li, Sophia Yat Mei Lee, Ying Chen, Chu-Re...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Document image segmentation algorithms primarily aim at separating text and graphics in the presence of complex layouts. However, for many non-Latin scripts, segmentation becomes a ch...
There is a growing research interest in opinion retrieval as on-line users' opinions are becoming more and more popular in business, social networks, etc. Practically speakin...