This paper studies automatic extraction of structured data from Web pages. Each of such pages may contain several groups of structured data records. Existing automatic methods stil...
In reverse engineering, parsing may be partially done to extract lightweight source models. Parsing code containing preprocessing directives, syntactical errors and embedded langu...
A code clone represents a sequence of statements that are duplicated in multiple locations of a program. Clones often arise in source code as a result of multiple cut/paste operat...
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
Abstract. Several national statistical agencies are now releasing partially synthetic, public use microdata. These comprise the units in the original database with sensitive or ide...