The Lixto Project: Exploring New Frontiers of Web Data Extraction


The Lixto project is an ongoing research effort in the area of Web data extraction. The project originally set out to develop a logic-based extraction language and a tool for visually defining extraction programs from sample Web pages, but its scope has been extended over time. Today, new issues are being investigated, such as employing learning algorithms to define extraction programs, automatically extracting data from Web pages with a table-centric visual appearance, and extracting from alternative document formats such as PDF.

Julien Carme, Michal Ceresna, Oliver Frölich, Georg Gottlob, Tamir Hassan, Marcus Herzog, Wolfgang Holzinger, Bernhard Krüpl


BNCOD 2006 | BNCOD 2007 | Extraction Programs | Logic-based Extraction Language | Web Data Extraction


Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	BNCOD
Authors	Julien Carme, Michal Ceresna, Oliver Frölich, Georg Gottlob, Tamir Hassan, Marcus Herzog, Wolfgang Holzinger, Bernhard Krüpl
