Sciweavers

» Extracting Partial Structures from HTML Documents
AAAI
2006
13 years 10 months ago
Automatic Wrapper Generation Using Tree Matching and Partial Tree Alignment
This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...
Yanhong Zhai, Bing Liu
HICSS
2008
IEEE
14 years 2 months ago
Using Visual Features for Fine-Grained Genre Classification of Web Pages
The field of automatic genre classification has primarily focused on extracting textual features from documents. The goal of this research is to investigate whether visual feature...
Ryan Levering, Michal Cutler, Lei Yu
JCDL
2006
ACM
14 years 2 months ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
ICSM
2002
IEEE
14 years 1 months ago
Documenting Pattern Use in Java Programs
Design patterns are widely recognized as important software development methods. Their use as software understanding tools, though generally acknowledged, has been scarcely explore...
Marco Torchiano
HT
2004
ACM
14 years 1 months ago
Hypertext versioning for embedded link models
In this paper, we describe Chrysant, a hypertext version control system for embedded link models. Chrysant provides general-purpose versioning capability to hypertext systems with ...
Kai Pan, E. James Whitehead Jr., Guozheng Ge