Crawlers in a knowledge management system need to collect and archive documents from websites, and also track the change status of these documents. However, the existence of URL r...
The Web has become the world’s largest information source. Unfortunately, the main success factor of the Web, the inherent principle of distribution and autonomy of the participa...
My research investigates into using Markov chains to make link prediction and the transition matrix derived from Markov chains to acquire structural knowledge about Web sites. The ...
There is a large amount of data that is published on the Web and several techniques have been developed to extract and integrate data from Web sources. However, Web data are inhere...
Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, ...
Querying data from presentation formats like HTML, for purposes such as information extraction, requires the consideration of tree structures as well as the consideration of spati...