This paper presents and compares two methods for evaluating the syntactic similarity between documents. The first method uses the Patricia tree, constructed from the original doc...
Lists of ordered objects are widely used as representational forms. Such ordered objects include Web search results or best-seller lists. Clustering is a useful data analysis tech...
Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partia...
Existing web search engines provide users with the ability to query an off-line database of indices in order to decide on an entry point for further manual navigation. Results are...
Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and simila...
Xiaohong Wang, Aaron M. Smalter, Jun Huan, Gerald ...