In this paper we analyze the Web coverage of three search engines: Google, Yahoo and MSN. We conducted a 15-month study collecting 15,770 Web content or information pages linked f...
Yang Sok Kim, Byeong Ho Kang, Paul Compton, Hirosh...
This paper describes how to use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic Data Gathering, Analysis, and Retrieval system)....
The Web is nowadays moving from a Web of data to a Web of services. In this paper we present our approach for Web Service discovery at Web scale, targeted to support flexible and ...
The Internet consists of several billion documents. Choosing information from such a great number of Web pages is not easy. We do not think that the interfaces of traditional sear...
There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as W...