Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In...
Davide Canali, Marco Cova, Giovanni Vigna, Christo...
Crawling web pages written in Arabic or any other language with limited content on the web may, at first, seem to parallel the process of crawling the English content. However, t...
Searching for a person name in a Web Search Engine usually leads to a number of web pages that refer to several people sharing the same name. In this paper we study whether it is ...
Javier Artiles, Julio Gonzalo, Enrique Amigó...
Wikis are currently used to provide knowledge management systems for individual enterprises. The initial explanations of word entries (entities) in such a system can be...
DSMW is an extension to Semantic MediaWiki (SMW) that allows users to create a network of SMW servers that share common semantic wiki pages. DSMW users can create communication channels b...
This paper deals with one aspect of the index quality of search engines: index freshness. The purpose is to analyse the update strategies of the major Web search engines Google, Y...
In this paper, a model for websites is presented. The model is well-suited for the formal verification of dynamic as well as static properties of the system. A website is defined ...
Page-based Linear Genetic Programming (GP) is proposed in which individuals are described in terms of a number of pages. Pages are expressed in terms of a fixed number of instructi...
Since the WWW encourages hypertext and hypermedia document authoring (e.g. HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...
In the traditional setting, text categorization is formulated as a concept learning problem where each instance is a single isolated document. However, this perspective is not appr...