Learning to summarise XML documents using content and structure



Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach is that it draws on features not only from the content of documents, but also from their logical structure. We follow a sentence extraction-based summarisation method that employs a novel machine learning approach. To find which features are more effective for producing summaries, this approach views sentence extraction as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that the inclusion of features from the logical structure of documents increases the effectiveness of the summariser, and that the novel machine learning approach is also effective and well-suited to the task of summarisation in the context of XML documents. Our approach is generic and is therefore applicable...
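To make the idea of "sentence extraction as an ordering task" concrete, here is a minimal sketch, not the authors' model. It assumes three hypothetical features (title-word overlap as a content feature; element depth and position within the element as structure features) and swaps in a simple pairwise ranking perceptron as the learner. The tag names, feature choices, and function names are all illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET


def sentence_features(xml_text):
    """Return (sentence, features) pairs for every sentence outside <title>.

    Hypothetical features, chosen only for illustration:
      [0] share of the sentence's words that occur in the title (content)
      [1] 1 / depth of the enclosing element                    (structure)
      [2] 1.0 if the sentence opens its element, else 0.0       (structure)
    """
    root = ET.fromstring(xml_text)
    title = root.find(".//title")
    title_text = title.text if title is not None and title.text else ""
    title_words = set(re.findall(r"\w+", title_text.lower()))
    rows = []

    def walk(element, depth):
        text = (element.text or "").strip()
        if element.tag != "title" and text:
            sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
            for i, sentence in enumerate(sentences):
                words = set(re.findall(r"\w+", sentence.lower()))
                overlap = len(words & title_words) / max(len(words), 1)
                rows.append((sentence, [overlap, 1.0 / depth, float(i == 0)]))
        for child in element:
            walk(child, depth + 1)

    walk(root, 1)
    return rows


def score(weights, features):
    return sum(w * x for w, x in zip(weights, features))


def train_ranker(examples, epochs=20, lr=0.1):
    """Pairwise perceptron: push each summary sentence above each non-summary one.

    examples is a list of (features, label) with label 1 for summary sentences.
    """
    weights = [0.0] * len(examples[0][0])
    positives = [f for f, label in examples if label == 1]
    negatives = [f for f, label in examples if label == 0]
    for _ in range(epochs):
        for pos in positives:
            for neg in negatives:
                diff = [p - n for p, n in zip(pos, neg)]
                if score(weights, diff) <= 0:  # pair is mis-ordered
                    weights = [w + lr * d for w, d in zip(weights, diff)]
    return weights


def summarise(rows, weights, k=1):
    """Order sentences by learned score and keep the top k."""
    ranked = sorted(rows, key=lambda r: score(weights, r[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]


# Toy document and labels (made up for this sketch).
DOC = """<article><title>XML summarisation</title>
<sec><p>We study summarisation of XML documents. The weather was nice.</p>
<p>Lunch was served at noon.</p></sec></article>"""

rows = sentence_features(DOC)
labels = [1, 0, 0]  # assume only the first sentence belongs in the summary
weights = train_ranker([(f, y) for (_, f), y in zip(rows, labels)])
summary = summarise(rows, weights, k=1)
```

Because the learner only compares pairs of sentences, the learned weights indicate which features help place summary sentences above the rest, which is one way to read the abstract's point about finding the more effective features.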

Massih-Reza Amini, Anastasios Tombros, Nicolas Usunier, Mounia Lalmas, Patrick Gallinari


CIKM 2005 | Machine Learning Approach | Sentence Extraction-based Summarisation | XML Documents


» Investigating the use of summarisation for interactive XML retrieval

» Extractive summarisation of legal texts

» The Use of Summaries in XML Retrieval

» On modular transformation of structural content

» Intelligent data entry assistant for XML using ensemble learning

» Combining Structure and Content Similarities for XML Document Clustering

» UML Documentation Support for XML Schema

» Ontology Learning by Analyzing XML Document Structure and Content

Post Info

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	CIKM
Authors	Massih-Reza Amini, Anastasios Tombros, Nicolas Usunier, Mounia Lalmas, Patrick Gallinari
