Learning-based summarisation of XML documents

Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach is that it uses features drawn not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based summarisation technique. To find which features are more effective for producing summaries, this approach views sentence extraction as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that including features from the logical structure of documents increases the effectiveness of the summariser, and that the learnable system is effective and well suited to the task of summarisation in the context of XML documents. Our approach is generic, and is therefore applicable, apart from entire documents, to elements of varying granulari...
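To make the idea of sentence extraction as an ordering task more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy XML layout with `<title>` and `<p>` elements, three hypothetical features (title-word overlap as a content feature, first-sentence position and element depth as structural features), and a simple pairwise perceptron update standing in for whatever ranking learner the paper actually uses. All function names are illustrative.

```python
import re
import xml.etree.ElementTree as ET


def sentence_features(xml_text):
    """Return (sentence, features) pairs for sentences found in <p> elements.

    Features (all hypothetical, chosen for illustration):
      [title-word overlap, is-first-sentence-of-paragraph, 1 / element depth]
    Only the text before a paragraph's first child element is read.
    """
    root = ET.fromstring(xml_text)
    title = root.findtext("title") or ""
    title_words = set(re.findall(r"\w+", title.lower()))
    rows = []

    def walk(elem, depth):
        if elem.tag == "p" and elem.text:
            parts = re.split(r"(?<=[.!?])\s+", elem.text)
            sentences = [s.strip() for s in parts if s.strip()]
            for i, sent in enumerate(sentences):
                words = set(re.findall(r"\w+", sent.lower()))
                overlap = len(words & title_words) / (len(title_words) or 1)
                first = 1.0 if i == 0 else 0.0
                rows.append((sent, [overlap, first, 1.0 / depth]))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 1)
    return rows


def score(weights, features):
    """Linear score: higher means the sentence should rank earlier."""
    return sum(w * f for w, f in zip(weights, features))


def train_ranker(examples, epochs=20, lr=0.1):
    """Pairwise perceptron: push summary sentences above non-summary ones.

    examples is a list of (features, label) with label 1 for sentences
    that belong in the reference summary and 0 otherwise.
    """
    weights = [0.0] * len(examples[0][0])
    pos = [f for f, y in examples if y == 1]
    neg = [f for f, y in examples if y == 0]
    for _ in range(epochs):
        for fp in pos:
            for fn in neg:
                if score(weights, fp) <= score(weights, fn):
                    weights = [w + lr * (a - b)
                               for w, a, b in zip(weights, fp, fn)]
    return weights


def summarise(xml_text, weights, k=1):
    """Rank all sentences by learned score and keep the top k."""
    rows = sentence_features(xml_text)
    ranked = sorted(rows, key=lambda r: score(weights, r[1]), reverse=True)
    return [sent for sent, _ in ranked[:k]]
```

Because the learner only needs feature vectors, the same functions could in principle be applied to a single element rather than a whole document by passing that element's serialised XML; the learned weights then give a rough indication of which features contribute most to the ordering.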

Massih-Reza Amini, Anastasios Tombros, Nicolas Usunier, Mounia Lalmas

Documents | Extensible Markup Language | IR 2007 | Natural Language Processing | XML Documents |

» Learning to summarise XML documents using content and structure

» The Use of Summaries in XML Retrieval

» Extractive summarisation of legal texts

Post Info

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2007
Where	IR
Authors	Massih-Reza Amini, Anastasios Tombros, Nicolas Usunier, Mounia Lalmas
