We develop new algorithms for learning monadic node selection queries in unranked trees from annotated examples, and apply them to visually interactive Web information extraction. ...
We present new techniques for supervised wrapper generation and automated web information extraction, and a system called Lixto implementing these techniques. Our system can gener...
Abstract. A base problem in Web information extraction is to find appropriate queries for informative nodes in trees. We propose to learn queries for nodes in trees automatically ...
: ? Towards Combining Web Classification and Web Information Extraction: a Case Study Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi HP Laboratories HPL-2009-86 Classific...
This paper outlines our approach to the creation of annotated corpora for the purposes of Web Information Extraction, and presents the Web Annotation tool. This tool enables the a...
The Web contains an abundance of useful semistructured information about real world objects, and our empirical study shows that strong sequence characteristics exist for Web infor...
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Y...
Extracting and integrating object information from the Web is of great significance for Web data management. The existing Web information extraction techniques cannot provide sati...