The quality of a statistical machine translation (SMT) system is heavily dependent upon the amount of parallel sentences used in training. In recent years, there have been several...
The longstanding problem of automatic table interpretation still illudes us. Its solution would not only be an aid to table processing applications such as large volume table conve...
We consider the problem of content extraction from online news webpages. To explore to what extent the syntactic markup and the visual structure of a webpage facilitate the extrac...
Music and songs usually have repeating patterns and prominent structure. The automatic extraction of such repeating patterns and structure is useful for further music summarizatio...
It is important to automatically extract key information from sensitive text documents for intelligence analysis. Text documents are usually unstructured and information extraction...