— We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
Bagging and boosting, two effective machine learning techniques, are applied to natural language parsing. Experiments using these techniques with a trainable statistical parser ar...
This paper describes a series of experiments to test the hypothesis that the parallel application of multiple NLP tools and the integration of their results improves the correctne...
In this paper, we discuss a method for representing and matching planar curves. The technique is based on using calculations from concentric circles to represent each curve by two...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...