Closing the loop in webpage understanding

The two most important tasks in information extraction from the Web are webpage structure understanding and natural language sentence processing. However, little work has been done towards an integrated statistical model for understanding webpage structures and processing natural language sentences within the HTML elements. Our recent work on webpage understanding introduces a joint model of Hierarchical Conditional Random Fields (HCRF) and extended Semi-Markov Conditional Random Fields (Semi-CRF) to leverage the page structure understanding results in free text segmentation and labeling. In this top-down integration model, the decisions of the HCRF model guide the decision-making of the Semi-CRF model. However, the drawback of the top-down integration strategy is also apparent: the decisions of the Semi-CRF model cannot be used by the HCRF model to guide its own decision-making. This paper proposes a novel framework called WebNLP, which enables bidirectional integra...

Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-Rong Wen

CIKM 2008 | Conditional Random Fields | Information Management | Page Structure Understanding | Structure Understanding

» Robust Line Drawing Understanding Incorporating Efficient Closed Symbols Extraction

» Closing the loop on test creation: a question assessment mechanism for instructors

» A Tool and Methodology for AC-Stability Analysis of Continuous-Time Closed-Loop Systems

» Towards understanding architectural tradeoffs in MEMS closed-loop feedback control

» Using Problem-Domain and Artefact-Domain Architectural Modelling to Understand System Evolut...

» The Relevance of Non-generic Events in Scale Space Models

» A Hidden Markov Model for Predicting Transmembrane Helices in Protein Sequences

» Motion Planning for a Class of Planar Closed-chain Manipulators

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CIKM
Authors	Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-Rong Wen
