We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number...
Web is the most important repository of different kinds of media such as text, sound, video, images etc. Web mining is the process of applying data mining techniques to automatica...
Aggregating search results from a variety of heterogeneous sources or verticals such as news, image and video into a single interface is a popular paradigm in web search. Although...
Ke Zhou, Ronan Cummins, Mounia Lalmas, Joemon M. J...
This paper studies the problem of unified ranked retrieval of heterogeneous XML documents and Web data. We propose an effective search engine called Sailer to adaptively and versa...
As the Internet grows explosively, search engines play a more and more important role for users in effectively accessing online information. Recently, it has been recognized that ...
Ming Ji, Jun Yan, Siyu Gu, Jiawei Han, Xiaofei He,...