COLING
2000

Mining Tables from Large Scale HTML Texts

Download nlg.csie.ntu.edu.tw

Tables are a very common presentation scheme, but few papers address table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts. Table filtering, recognition, interpretation, and presentation are discussed. Heuristic rules and cell similarities are employed to identify tables. The F-measure of table recognition is 86.50%. We also propose an algorithm to capture attribute-value relationships among table cells. Finally, more structured data is extracted and presented.
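
For readers unfamiliar with the metric, the F-measure quoted above combines precision and recall into a single score. The abstract does not give the underlying precision and recall, nor does it say whether the balanced (beta = 1) variant was used, so the sketch below only shows the standard definition. The function names and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) from raw counts of recognized tables."""
    predicted = true_positives + false_positives   # tables the system reported
    actual = true_positives + false_negatives      # tables that really exist
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-measure: weighted harmonic mean of precision and recall.

    beta = 1 weights both equally (F1). Whether the paper used beta = 1
    is an assumption here, not something the abstract states.
    """
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the paper's data).
    p, r = precision_recall(true_positives=8, false_positives=2,
                            false_negatives=2)
    print(f"precision={p:.2f} recall={r:.2f} F={f_measure(p, r):.2f}")
```

Because the score is a harmonic mean, it is pulled toward the lower of the two values, so a system cannot reach a high F-measure by maximizing only precision or only recall.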

Hsin-Hsi Chen, Shih-Chung Tsai, Jin-He Tsai

COLING 2000 | COLING 2008 | Common Presentation Scheme | Table Extraction | Text Data Mining

Related Content

» Mining the Web for lists of Named Entities

» Large-Scale Knowledge Acquisition from Botanical Texts

» Dragon Toolkit Incorporating Auto-Learned Semantic Knowledge into Large-Scale Text Retrieval...

» Mining Internet-Scale Software Repositories

» WebSets extracting sets of entities from the web using unsupervised information extraction

» Towards domain-independent information extraction from web tables

» Large Scale Parallel Document Mining for Machine Translation

» Scaling up text classification for large file systems

» Mining Console Logs for Large-Scale System Problem Detection

Post Info
Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	2000
Where	COLING
Authors	Hsin-Hsi Chen, Shih-Chung Tsai, Jin-He Tsai
