Consider a rooted tree T of arbitrary maximum degree d representing a collection of n web pages connected via a set of links, all reachable from a source home page represented by ...
Automatic classification of web pages is an effective way to deal with the difficulty of retrieving information from the Internet. Although there are many automatic classification...
A new dictionary-based text categorization approach is proposed to classify the chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-re...
Chunyan Liang, Li Guo, Zhaojie Xia, Feng-Guang Nie...
Tables on web pages contain a huge amount of semantically explicit information, which makes them a worthwhile target for automatic information extraction and knowledge acquisition...
Web applications have become an invaluable source of information for many different vertical solutions, but their complex navigation and semistructured format make their informatio...