Sciweavers

» Extracting Structured Data from Web Pages
ITCC
2000
IEEE
14 years 1 months ago
Towards Knowledge Discovery from WWW Log Data
As the result of interactions between visitors and a web site, an HTTP log file contains very rich knowledge about users' on-site behaviors, which, if fully exploited, can better c...
Feng Tao, Fionn Murtagh
SIGMOD
2007
ACM
14 years 8 months ago
Intel Mash Maker: join the web
Intel® Mash Maker is an interactive tool that tracks what the user is doing and tries to infer what information and visualizations they might find useful for their current task. M...
Robert Ennals, Eric A. Brewer, Minos N. Garofalaki...
IAT
2007
IEEE
14 years 2 months ago
An Intelligent Web Agent to Mine Bilingual Parallel Pages via Automatic Discovery of URL Pairing Patterns
This paper describes an intelligent agent to facilitate bitext mining from the Web via automatic discovery of URL pairing patterns (or keys) for retrieving parallel web pages. The...
Chunyu Kit, Jessica Yee Ha Ng
APWEB
2010
Springer
13 years 6 months ago
ECON: An Approach to Extract Content from Web News Page
This paper provides a simple but effective approach, named ECON, to fully-automatically extract content from Web news page. ECON uses a DOM tree to represent the Web news...
Yan Guo, Huifeng Tang, Linhai Song, Yu Wang 0009, ...
DEXA
2005
Springer
14 years 2 months ago
An XML Approach to Semantically Extract Data from HTML Tables
Data intensive information is often published on the internet in the format of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen