An Efficient Algorithm for Identifying the Most Contributory Substring

Abstract. Detecting repeated portions of strings has important applications to many areas of study including data compression and computational biology. This paper defines and presents a solution for the Most Contributory Substring Problem, which identifies the single substring that represents the largest proportion of the characters within a set of strings. We show that a solution to the problem can be achieved with an O(n) running time (where n is the total number of characters in all of the input strings) when overlapping occurrences of the most contributory substring are permitted. Furthermore, we present an extended algorithm that does not permit occurrences of the most contributory substring to overlap. The expected running time of the extended algorithm is O(n log n) while its worst case performance is O(n^2).
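
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's O(n) or O(n log n) algorithm, which the abstract does not describe in detail. It assumes that a substring's contribution is its length multiplied by its total number of occurrences across the input strings, which is one reading of "largest proportion of the characters". The function names `count_overlapping` and `most_contributory` and the tie-breaking rule (lexicographically smallest substring wins) are illustrative choices, not taken from the paper.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing occurrences to overlap."""
    count, start = 0, 0
    while True:
        idx = text.find(sub, start)
        if idx == -1:
            return count
        count += 1
        start = idx + 1  # advance by one so overlapping matches are found


def most_contributory(strings, allow_overlap=True):
    """Brute-force search for the substring with the largest contribution.

    Assumption: contribution = len(substring) * number of occurrences.
    This enumerates every substring, so it is far slower than the
    running times quoted in the abstract and is for illustration only.
    """
    candidates = {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }
    best, best_score = None, 0
    for sub in sorted(candidates):  # sorted for a deterministic tie-break
        if allow_overlap:
            occurrences = sum(count_overlapping(s, sub) for s in strings)
        else:
            # str.count scans left to right and counts non-overlapping matches
            occurrences = sum(s.count(sub) for s in strings)
        score = len(sub) * occurrences
        if score > best_score:
            best, best_score = sub, score
    return best, best_score


if __name__ == "__main__":
    print(most_contributory(["abcabc", "abc"]))
    print(most_contributory(["aaaa"], allow_overlap=True))
    print(most_contributory(["aaaa"], allow_overlap=False))
```

The input "aaaa" shows why the two variants in the abstract can differ: with overlaps permitted, "aa" is counted three times, while without overlaps it is counted only twice, so the best-scoring substring changes.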

Ben Stephenson

Contributory Substring | Contributory Substring Problem | DAWAK 2007 | Extended Algorithm | Information Management |

Added	14 Aug 2010
Updated	14 Aug 2010
Type	Conference
Year	2007
Where	DAWAK
Authors	Ben Stephenson
