— Distributed data mining has recently caught a lot of attention as there are many cases where pooling distributed data for mining is probibited, due to either huge data volume o...
Chak-Man Lam, Xiaofeng Zhang, William Kwok-Wai Che...
Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phras...
Hung V. Nguyen, P. Velamuru, Deepak Kolippakkam, H...
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
This paper proposes a method for creating a high quality collection of researchers’ homepages. The proposed method consists of three phases: rough filtering of the possible web p...
Part of the process of data integration is determining which sets of identifiers refer to the same real-world entities. In integrating databases found on the Web or obtained by us...