Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...
Heterogeneous entities or objects are very common and are often interrelated. For example, typical Web search activities involve multiple types...
To surface high-quality web pages, search engines often sort retrieved pages by rank. The rank is a measure of a page's importance. Famous ranking algorit...
Guang Feng, Tie-Yan Liu, Xu-Dong Zhang, Tao Qin, B...
Our work examines Web revisitation patterns. Everybody revisits Web pages, but their reasons for doing so can differ depending on the particular Web page, their topic of interest,...
We introduce a new method to improve web site text content by identifying the most relevant free text in the web pages. In order to understand the variations in web page text, we c...