Sciweavers

» WebBase: a repository of Web pages
AIRWEB
2009
Springer
14 years 3 months ago
Looking into the past to better classify web spam
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
Na Dai, Brian D. Davison, Xiaoguang Qi
IADIS
2004
13 years 10 months ago
Simplifying the Clickstream Retrieval Using Weblogger Tool
Data Webhouses are used to retain all the information related to web users' behavior within a web site, working as a shared repository of business data. The advent of e-busin...
João Silva, João Bernardino
ER
2009
Springer
14 years 3 months ago
Modelling Safe Interface Interactions in Web Applications
Current Web applications embed sophisticated user interfaces and business logic. The original interaction paradigm of the Web based on static content pages that are brows...
Marco Brambilla, Jordi Cabot, Michael Grossniklaus
WWW
2007
ACM
14 years 9 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
KES
2006
Springer
13 years 8 months ago
Integrated Document Browsing and Data Acquisition for Building Large Ontologies
Named entities (e.g., "Kofi Annan", "Coca-Cola", "Second World War") are ubiquitous in web pages and other types of document and often provide a simpl...
Felix Weigel, Klaus U. Schulz, Levin Brunner, Edua...