With currently available tools and languages, translating between an existing XML format and RDF is a tedious and error-prone task. The importance of this problem is acknowledged b...
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
The Online Database of Interlinear Text (ODIN)1 is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...
We discuss real-world case studies involving the implementation of a web services middleware tier for the IBM Blue Gene/L supercomputer to support financial business applications...
Thomas Phan, Ramesh Natarajan, Satoki Mitsumori, H...
—Most Web and legacy paper-based documents are available in human comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an ...