Sciweavers

ICDE
2006
IEEE

Extracting Objects from the Web

Extracting and integrating object information from the Web is of great significance for Web data management. Existing Web information extraction techniques cannot provide a satisfactory solution to the Web object extraction task, because objects of the same type are distributed across diverse Web sources whose structures are highly heterogeneous. In this paper, we propose a novel approach called Object-Level Information Extraction (OLIE) to extract Web objects. This approach extends a classic information extraction algorithm, Conditional Random Fields (CRF), by adding Web-specific information. The experimental results show that OLIE can significantly improve Web object extraction accuracy.
Zaiqing Nie, Fei Wu, Ji-Rong Wen, Wei-Ying Ma
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2006
Where ICDE
Authors Zaiqing Nie, Fei Wu, Ji-Rong Wen, Wei-Ying Ma