We illustrate that Web searches can often be used to generate background text for text classification. This is the case because there are frequently many pages on the...
We present a description of three different algorithms that use background knowledge to improve text classifiers. One uses the background knowledge as an index into the set of tra...
Almost all document analysis approaches need to perform a global analysis of the page orientation as a separate process at an early stage. It would be preferable to estimate the o...
A new dictionary-based text categorization approach is proposed to classify chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
Consider a supervised learning problem in which examples contain both numerical- and text-valued features. To use traditional feature-vector-based learning methods, one could treat...
Sofus A. Macskassy, Haym Hirsh, Arunava Banerjee, ...