Abstract. Current text classification systems typically use term stems for representing document content. Semantic Web technologies allow the usage of features on a higher semantic...
We propose to combine two approaches for modeling data admitting sparse representations: on the one hand, dictionary learning has proven effective for various signal processing ta...
Bayesian text classifiers face a common issue which is referred to as data sparsity problem, especially when the size of training data is very small. The frequently used Laplacian...
This paper describes a new versatile algorithm for correcting nonlinear distortions, such as curvature of book pages, in camera based document processing. We introduce the idea of...
—With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can ...
Weiming Hu, Ou Wu, Zhouyao Chen, Zhouyu Fu, Stephe...