There are two types of wrappers: string-based wrappers, such as the LR wrapper, and tree-based wrappers. A tree-based wrapper designates extraction regions by nodes on the ...
As camera resolution increases, high-speed non-contact text capture through a digital camera is opening up a new channel for document capture and understanding. Unfortunately, per...
In this paper we describe a top-down approach to the segmentation and representation of documents containing tabular structures. Examples of these documents are invoices and techn...
Francesca Cesarini, Marco Gori, Simone Marinai, Gi...
We propose new features and algorithms for automating Web-page classification tasks such as content recommendation and ad blocking. We show that the automated classification of We...
This paper proposes an expert peering system for information exchange. Our objective is to develop a real-time search engine for an online community where users can ask e...