In this paper, we propose EPCI, a new system that extracts potentially copyright-infringing texts from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
It is extremely hard for a global organization with services over multiple channels to capture a consistent and unified view of its data, services, and interactions. While SOA and...
Ismail Ari, Jun Li, Riddhiman Ghosh, Mohamed Dekhi...
In this paper, we show that most multi-term queries include more than one topic and that users usually reformulate their queries by topic rather than by term. In order to provide empi...
Xuefeng He, Jun Yan, Jinwen Ma, Ning Liu, Zheng Ch...
One of the main hurdles towards a wide endorsement of ontologies is the high cost of constructing them. Reuse of existing ontologies offers a much cheaper alternative than buildin...