Sciweavers

139 search results - page 5 / 28
» An Approach to Identify Duplicated Web Pages
CIKM
2006
Springer
13 years 11 months ago
A fast and robust method for web page template detection and removal
The widespread use of templates on the Web is considered harmful for two main reasons. Not only do they compromise the relevance judgment of many web IR and web mining methods suc...
Karane Vieira, Altigran Soares da Silva, Nick Pint...
SIGIR
2010
ACM
13 years 8 months ago
Visual summarization of web pages
Visual summarization is an attractive new scheme to summarize web pages, which can help achieve a more friendly user experience in search and re-finding tasks by allowing users qu...
Binxing Jiao, Linjun Yang, Jizheng Xu, Feng Wu
ICNC
2005
Springer
14 years 1 months ago
Using SOFM to Improve Web Site Text Content
We introduce a new method to improve web site text content by identifying the most relevant free text in the web pages. In order to understand the variations in web page text, we c...
Sebastián A. Ríos, Juan D. Velá...
DASFAA
2005
IEEE
13 years 9 months ago
Automatic Data Extraction from Data-Rich Web Pages
Extracting data from web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. In this paper, we propose a...
Dongdong Hu, Xiaofeng Meng
JCIT
2007
13 years 7 months ago
A Tool to Personalize the Ranking of the Documents Returned by an Internet Search Engine
Internet search engines identify web pages that contain user-specified keywords, and then rank these pages according to their (heuristically assessed) relevance to the user’s qu...
Wadee S. Alhalabi, Miroslav Kubat, Moiez A. Tapia