Automatically recording the marks applied to paper documents on their electronic originals would preserve the information those annotations represent. Even if a user lost the original paper document, the marked-up version could be regenerated simply by reprinting it. We describe a solution that saves an electronic representation of the highlights users commonly apply over machine-printed text. We present a unique combination of algorithms that maps the image captured by a pen scanner affixed to a highlighting pen onto text strings in electronic documents. Documents are automatically located in a large database using characteristics of the highlighted text. We describe the system components, including the image recognition algorithms, and discuss their performance in finding a unique mapping from an image of text onto a sequence of words in an electronic document within a large database.
Jonathan J. Hull, Dar-Shyang Lee