Consider the task of exploring the Web in order to find pages of a particular kind or on a particular topic. This task arises in the construction of search engines and Web knowled...
The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by ...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...
This paper describes Seeker, a platform for large-scale text analytics, and SemTag, an application written on the platform to perform automated semantic tagging of large corpora. ...
Stephen Dill, Nadav Eiron, David Gibson, Daniel Gr...
Abstract. Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Eye tracking experiments have shown that titles of Web search results play a crucial role in guiding a user’s search process. We present a machine-learned algorithm that trains ...
Tapas Kanungo, Nadia Ghamrawi, Ki Yuen Kim, Lawren...