Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is...
We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
The semantic web is expected to have an impact at least as big as that of the existing HTML-based web, if not greater. However, the challenge lies in creating this semantic web an...
Background: The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records in order...
Angus Roberts, Robert J. Gaizauskas, Mark Hepple, ...