Standard IR systems can process queries such as “web NOT internet”, enabling users who are interested in arachnids to avoid documents about computing. The documents retrieved ...
Abstract. This paper investigates a new extension of the Probabilistic Latent Semantic Analysis (PLSA) model [6] for text classification where the training set is partially labeled...
We describe a model of document citation that learns to identify hubs and authorities in a set of linked documents, such as pages retrieved from the World Wide Web, or papers retr...
We investigate language models for informational and navigational web search. Retrieval on the web is a task that differs substantially from ordinary ad hoc retrieval. We perfor...
During the lifecycle of a large-scale Web application, Web developers produce a wide variety of inter-related Web objects. Following good Web engineering practice, developers often ...