Abstract. Data-intensive information is often published on the internet in the form of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen
This paper proposes a reactive website design strategy based on two complementary website analyses. An analysis of 15 Swiss hotels' combined log files – 345’440 web site ...
Roland Schegg, Thomas Steiner, Thouraya Gherissi-L...
Since syntactically different URLs can represent the same resource on the WWW, there are ongoing efforts to define URL normalization in the standards communities. This paper cons...
We present an approach to automatically retrieve and extract lyrics of arbitrary songs from the Internet. It is intended to provide easy and convenient access to lyrics for users,...
Current peer-to-peer (p2p) full-text keyword search techniques fall into the following categories: document-based partitioning, keyword-based partitioning, hybrid indexing, and se...