Statistical language modeling has been successfully used for speech recognition, part-of-speech tagging, and syntactic parsing. Recently, it has also been applied to information r...
Table of contents (TOC) recognition has attracted a great deal of attention in recent years. After reviewing the merits and drawbacks of the existing TOC recognition methods, we h...
The Machine Learning and Pattern Recognition communities are facing two challenges: solving the normalization problem, and solving the deep learning problem. The normalization pro...
For most users, authoring multimedia documents remains a complex task. One solution to deal with this problem is to provide template-based authoring tools but with the drawback of...
When search is against structured documents, it is beneficial to extract information from user queries in a format that is consistent with the backend data structure. As one step...