Sciweavers

KDD
2009
ACM

Towards combining web classification and web information extraction: a case study

Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi
HP Laboratories, HPL-2009-86
Keywords: Classification, Information extraction, Graphical model

Web content analysis often has two sequential and separate steps: Web Classification to identify the target Web pages and Web Information Extraction to extract the metadata contained in the target Web pages. This decoupled strategy is highly ineffective since the errors in Web classification will be propagated to Web information extraction and eventually accumulate to a high level. In this paper we study the mutual dependencies between these two steps and propose to combine them by using a model of Conditional Random Fields (CRFs). This model can be used to simultaneously recognize the target Web pages and extract the corresponding metadata. Systematic experiments in our project OfCourse for online course search show that this model significantly improves the F1 ...
Added 25 Nov 2009
Updated 25 Nov 2009
Type Conference
Year 2009
Where KDD
Authors Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi