Users build personal information spaces (stored as bookmarks, hotlists, or as a personal page of links) as their WWW-subset and interface to access the World-Wide Web. As the WWW ...
The increasing importance of search engines to commercial web sites has given rise to a phenomenon we call “web spam”, that is, web pages that exist only to mislead search eng...
A lot of functionality is needed when an application, such as a museum cataloguing system, is extended with semantic capabilities, for example ontological indexing functionality or...
Web Information Systems (WISs) embody a core technology for all larger enterprises. WISs are employed in an intranet or extranet for internal information interchange or communica...
We present the design of Dynabot, a guided Deep Web discovery system. Dynabot's modular architecture supports focused crawling of the Deep Web with an emphasis on matching, p...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...