Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Tables on web pages contain a huge amount of semantically explicit information, which makes them a worthwhile target for automatic information extraction and knowledge acquisition...
Abstract. A new web content structure based on visual representation is proposed in this paper. Many web applications such as information retrieval, information extraction and auto...
Enriching digital library’s author meta-data can lead to valuable services and applications. This paper addresses the problem of extracting authors’ information from their hom...
This paper deals with studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources ...