Syntactically different URLs could represent the same web page on the World Wide Web, and duplicate representation for web pages causes web applications to handle a large amount of...
The aim of the study was to determine how significance indicators assigned to different Web page elements (internal metadata, title, headings, and main text) influence automated cl...
We study estimation of mixture models for problems in which multiple views of the instances are available. Examples of this setting include clustering web pages or research papers ...
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages...
Vlassis Krikos, Sofia Stamou, Pavlos Kokosis, Alex...
Text transcoders are web-server systems that produce, on the fly, a text-only version of a web page requested by a user of a browser. Although the potential benefits of text...
Giorgio Brajnik, Daniela Cancila, Daniela Nicoli, ...
The dissemination of information available through the World Wide Web makes universal access more and more important and supports visually disabled people in their everyday life. ...
In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchi...
Existing search engines contain the picture of the Web from the past and their ranking algorithms are based on data crawled some time ago. However, a user requires not only relevan...