An increasing number of databases have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded in...
Yiyao Lu, Hai He, Hongkun Zhao, Weiyi Meng, Clemen...
A problem of supervised approaches for text classification is that they commonly require high-quality training data to construct an accurate classifier. Unfortunately, in many real...
In this work we compare different techniques to automatically find candidate web pages to substitute broken links. We extract information from the anchor text, the content of the p...
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...
The Web continues to grow at a tremendous rate. Search engines find it increasingly difficult to provide useful results. To manage this explosively large number of Web documents,...
Sandip Debnath, Tracy Mullen, Arun Upneja, C. Lee ...