Subspace learning is very important in today's world of information overload. Distinguishing between categories within a subset of a large data repository such as the web and ...
Nandita Tripathi, Michael P. Oakes, Stefan Wermter
Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
The Internet consists of several billion documents. Choosing information from such a great number of Web pages is not easy. We do not think that the interfaces of traditional sear...
In the current Web, e-documents have become the most common vehicle for delivering and exchanging information. As the number of e-documents has grown enormously, effective classificati...
With a rich variety of forms and types, digital resources are complex data objects. They grow fast in volume on the Web, but are hard to classify efficiently. The paper presents ...