Sciweavers

19 search results - page 2 / 4
AAAI
2006
13 years 11 months ago
Automatic Wrapper Generation Using Tree Matching and Partial Tree Alignment
This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...
Yanhong Zhai, Bing Liu
WWW
2010
ACM
14 years 4 months ago
Automatic extraction of clickable structured web contents for name entity queries
Today the major web search engines answer queries by showing ten result snippets, which need to be inspected by users for identifying relevant results. In this paper we investigat...
Xiaoxin Yin, Wenzhao Tan, Xiao Li, Yi-Chin Tu
ERCIMDL
2010
Springer
13 years 7 months ago
Link Proximity Analysis - Clustering Websites by Examining Link Proximity
This research-in-progress paper presents a new approach called Link Proximity Analysis (LPA) for identifying related web pages based on link analysis. In contrast to current techni...
Bela Gipp, Adriana Taylor, Jöran Beel
WWW
2005
ACM
14 years 10 months ago
Web data extraction based on partial tree alignment
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Yanhong Zhai, Bing Liu
KDD
2004
ACM
14 years 10 months ago
Web usage mining based on probabilistic latent semantic analysis
The primary goal of Web usage mining is the discovery of patterns in the navigational behavior of Web users. Standard approaches, such as clustering of user sessions and discoveri...
Xin Jin, Yanzan Zhou, Bamshad Mobasher