The unarguably fast, and continuous, growth of the volume of indexed (and indexable) documents on the Web poses a great challenge for search engines. This is true regarding not on...
Most classification algorithms are best at categorizing Web documents into a few categories, such as the top two levels in the Open Directory Project. Such a classification me...
MEDLINE is a very large database of abstracts of research papers in the medical domain, maintained by the National Library of Medicine. Documents in MEDLINE are supplied with manually ...
Kwangcheol Shin, Sang-Yong Han, Alexander F. Gelbu...
Real-time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real-time search engine over ...
In this paper, we are interested in describing Web pages by how users interact with their content. Thus, an alternative but complementary way of labelling and classifying Web docu...