: We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
Both human users and crawlers face the problem of finding good start pages to explore some topic. We show how to assist in qualifying pages as start nodes by link-based ranking al...
Although most of existing research usually detects events by analyzing the content or structural information of Web documents, a recent direction is to study the usage data. In th...
The Web has become an excellent source for gathering consumer opinions. There are now numerous Web sites containing such opinions, e.g., customer reviews of products, forums, disc...
The number of Web-databases has exploded during the last years. In order to justify the development of new information resources, it is essential to know if the use of existing re...