Sciweavers

39 search results - page 5 / 8
» Automatic Wrapper Generation Using Tree Matching and Partial...
WISE
2005
Springer
14 years 1 month ago
NET - A System for Extracting Web Data from Flat and Nested Data Records
This paper studies automatic extraction of structured data from Web pages. Each of such pages may contain several groups of structured data records. Existing automatic methods stil...
Bing Liu, Yanhong Zhai
WCRE
2003
IEEE
14 years 25 days ago
RegReg: a Lightweight Generator of Robust Parsers for Irregular Languages
In reverse engineering, parsing may be partially done to extract lightweight source models. Parsing code containing preprocessing directives, syntactical errors and embedded langu...
Mario Latendresse
ACMSE
2006
ACM
13 years 9 months ago
Phoenix-based clone detection using suffix trees
A code clone represents a sequence of statements that are duplicated in multiple locations of a program. Clones often arise in source code as a result of multiple cut/paste operat...
Robert Tairas, Jeff Gray
CACM
1998
13 years 7 months ago
Viewing WISs as Database Applications
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
Gustavo O. Arocena, Alberto O. Mendelzon
TDP
2010
13 years 6 months ago
Random Forests for Generating Partially Synthetic, Categorical Data
Abstract. Several national statistical agencies are now releasing partially synthetic, public use microdata. These comprise the units in the original database with sensitive or ide...
Gregory Caiola, Jerome P. Reiter