The Internet makes it possible to share and manipulate a vast quantity of information efficiently and effectively, but the rapid and chaotic growth experienced by the Net has gener...
The field of automatic genre classification has primarily focused on extracting textual features from documents. The goal of this research is to investigate whether visual feature...
Web pages are more than text: they contain much contextual and structural information, e.g., the title, the metadata, and the anchor text, each of which can be seen as a dat...
In this paper, we propose a new multi-view semi-supervised learning algorithm called Local Co-Training (LCT). The proposed algorithm employs a set of local models with vect...