The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving a variety of purposes. Th...
When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. Ho...
The publication time of a page can have a big impact on its relevance to a query, especially for time-sensitive pages such as news items. For news search engines, the publication ...
We propose mixtures of hidden Markov models for modelling clickstreams of web surfers. Hence, the page categorization is learned from the data without the need for a (possibly cumb...
Mining bilingual data (including bilingual sentences and terms) from the Web can benefit many NLP applications, such as machine translation and cross language information retrie...
Long Jiang, Shiquan Yang, Ming Zhou, Xiaohua Liu, ...