Nowadays, searches for the web pages of a person with a given name constitute a notable fraction of queries to Web search engines. Such a query would normally return web pages rela...
Dmitri V. Kalashnikov, Zhaoqi Chen, Sharad Mehrotr...
Search results generated by searchable databases are served dynamically and are far larger than the static documents on the Web. These results pages have been referred to as the Deep ...
Yasuhiro Yamada, Nick Craswell, Tetsuya Nakatoh, S...
In this paper, we propose a new system, called EPCI, that extracts potentially copyright-infringing texts from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
Recent advances in storage technology make it possible to store a series of large Web archives. It is now an exciting challenge for us to observe the evolution of the Web. In this pap...
A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this is a substantial source of rev...