Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
There have been recent improvements in document technologies like the standardization of object interfaces to access and manipulate the properties of web documents. There has also...
Abstract. We present a dialogue system that enables natural-language access to a Web information retrieval system. We use a Web Semantic Language to model the knowledge conv...