
SYNASC
2006
IEEE

HTML Pattern Generator--Automatic Data Extraction from Web Pages

Existing methods of information extraction from HTML documents include manual approaches, supervised learning, and automatic techniques. The manual method achieves high precision and recall, but it is difficult to apply to a large number of pages. Supervised learning requires human interaction to create positive and negative samples. Automatic techniques require less human effort, but the information they retrieve is not highly reliable.
Added 12 Jun 2010
Updated 12 Jun 2010
Type Conference
Year 2006
Where SYNASC
Authors Mirel Cosulschi, Adrian Giurca, Bogdan Udrescu, Nicolae Constantinescu, Mihai Gabroveanu