Two-dimensional (2-D) plots in digital documents on the web are an important source of information that is largely under-utilized. In this paper, we outline how data and text can ...
Saurabh Kataria, William Browuer, Prasenjit Mitra,...
In this paper, we propose a new system, called EPCI, for extracting texts that potentially infringe copyright from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
A text retrieval method called the thematic geographical search method has been developed and applied to a Japanese encyclopedia called the World Encyclopædia. In this method, th...
Automatic processing of images of steles is a challenging problem due to the variation in their structures and body text characteristics. In this paper, the area Voronoi diagram is us...
Information extraction is concerned with locating specific items in (unstructured) textual documents, e.g., for the acquisition of structured data. Then, the ...