A Natural and Multi-layered Approach to Detect Changes in Tree-Based Textual Documents



Several efficient and very powerful algorithms exist for detecting changes in tree-based textual documents, such as those encoded in XML. An important aspect is still underestimated in their design and implementation: the quality of the output, in terms of readability, clarity and accuracy for human users. This requirement is particularly relevant when diff-ing literary documents, such as books, articles, reviews, acts, and so on. This paper introduces the concept of 'naturalness' in diff-ing tree-based textual documents, and discusses a new extensible set of changes which can and should be detected. A naturalness-based algorithm is presented, as well as its application to diff-ing XML-encoded legislative documents. The algorithm, called JNDiff, proved to detect significantly better matchings (since new operations are recognized) and to be very efficient.
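To make the readability concern concrete, here is a minimal Python sketch of a purely positional tree diff. It is not JNDiff and is not taken from the paper: the function `naive_tree_diff`, the operation names, and the sample `<act>` documents are all hypothetical, chosen only to show how a structurally correct diff can still read unnaturally.

```python
import xml.etree.ElementTree as ET


def naive_tree_diff(old, new, path=None):
    """Compare two XML elements position by position (illustrative only).

    Returns a list of (operation, path, detail) tuples. This is a
    hypothetical baseline, not the JNDiff algorithm from the paper.
    """
    path = path or f"/{old.tag}"
    if old.tag != new.tag:
        return [("replace", path, f"{old.tag} -> {new.tag}")]

    changes = []
    old_text = (old.text or "").strip()
    new_text = (new.text or "").strip()
    if old_text != new_text:
        changes.append(("update-text", path, f"{old_text!r} -> {new_text!r}"))

    old_children, new_children = list(old), list(new)
    # Children are paired strictly by position; no matching is attempted.
    for index, (a, b) in enumerate(zip(old_children, new_children)):
        changes.extend(naive_tree_diff(a, b, f"{path}/{a.tag}[{index}]"))
    # Anything left over on either side is reported as a delete or insert.
    for extra in old_children[len(new_children):]:
        changes.append(("delete", f"{path}/{extra.tag}", ""))
    for extra in new_children[len(old_children):]:
        changes.append(("insert", f"{path}/{extra.tag}", ""))
    return changes


# Two versions of a tiny, made-up legislative fragment: the clauses swap places.
OLD = ET.fromstring("<act><p>First clause.</p><p>Second clause.</p></act>")
NEW = ET.fromstring("<act><p>Second clause.</p><p>First clause.</p></act>")

for change in naive_tree_diff(OLD, NEW):
    print(change)
```

On this input the sketch reports two independent text updates, one per paragraph, although a human reader would probably describe the edit as two clauses swapping places. The abstract does not say how JNDiff reports such a case; the example only illustrates the kind of gap between a valid diff and a readable one that motivates the paper.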

Angelo Di Iorio, Michele Schirinzi, Fabio Vitali, Carlo Marchetti


Diff-ing Literary Documents | ICEIS 2009 | Information Systems | Tree-based Textual Documents | XML-encoded Legislative Documents


Added	23 May 2010
Updated	23 May 2010
Type	Conference
Year	2009
Where	ICEIS
Authors	Angelo Di Iorio, Michele Schirinzi, Fabio Vitali, Carlo Marchetti
