Abstract. Software maintainers routinely have to deal with a multitude of artifacts, such as source code or documents, which often end up disconnected due to their different represen...
WebFountain is a platform for very large-scale text analytics applications that allows uniform access to a wide variety of sources. It enables the deployment of a variety of docum...
Automatic and non-invasive web personalization seems to be a challenge for today's web sites. Many web mining techniques are used to achieve this goal. Since current web sites evo...
This paper presents a method for the automatic classification of Hidden Web databases. In our approach, the classification tree for Hidden Web databases is constructed by tailo...
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom