This paper presents PDF-TREX, an heuristic approach for table recognition and extraction from PDF documents. The heuristics starts from an initial set of basic content elements an...
—Table detection is always an important task of document analysis and recognition. In this paper, we propose a novel and effective table detection method via visual separators an...
Jing Fang, Liangcai Gao, Kun Bai, Ruiheng Qiu, Xin...
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and high...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...