Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
Web is the most important repository of different kinds of media such as text, sound, video, images etc. Web mining is the process of applying data mining techniques to automatica...
This paper is to investigate the group behavior patterns of search activities based on Web search history data, i.e., clickthrough data, to boost search performance. We propose a ...
Although many algorithms have been developed to harvest lexical resources, few organize the mined terms into taxonomies. We propose (1) a semi-supervised algorithm that uses a roo...