Sciweavers

157 search results - page 5 / 32
» Automatic retargeting of web page content
ICWE
2009
Springer
14 years 5 months ago
A Layout-Independent Web News Article Contents Extraction Method Based on Relevance Analysis
The traditional Web news article contents extraction methods are time-costly and need much maintenance because they analyze the layout of news pages to generate the wrapp...
Hao Han, Takehiro Tokuda
WWW
2010
ACM
14 years 5 months ago
Automatic extraction of clickable structured web contents for name entity queries
Today the major web search engines answer queries by showing ten result snippets, which need to be inspected by users for identifying relevant results. In this paper we investigat...
Xiaoxin Yin, Wenzhao Tan, Xiao Li, Yi-Chin Tu
HICSS
2009
IEEE
14 years 5 months ago
An N-Gram Based Approach to Automatically Identifying Web Page Genre
The research reported in this paper is the first phase of a larger project on the automatic classification of web pages by their genres, using n-gram representations of the web pag...
Jane E. Mason, Michael A. Shepherd, Jack Duffy
CEAS
2007
Springer
14 years 5 months ago
Characterizing Web Spam Using Content and HTTP Session Analysis
Web spam research has been hampered by a lack of statistically significant collections. In this paper, we perform the first large-scale characterization of web spam using conten...
Steve Webb, James Caverlee, Calton Pu
LREC
2008
14 years 9 days ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany