As a complement to page content, anchor texts have been used extensively, and proven useful, in commercial search engines. However, anchor texts have been assumed to be...
Zhicheng Dou, Ruihua Song, Jian-Yun Nie, Ji-Rong W...
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text through search engine hit counts, we created a Web Corpu...
Many web documents refer to specific geographic localities and many people include geographic context in queries to web search engines. Standard web search engines treat the geogra...
Subodh Vaid, Christopher B. Jones, Hideo Joho, Mar...
A major difference between corporate intranets and the Internet is that in intranets the barrier for users to create web pages is much higher. This limits the amount and quality o...
Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, E...
In this paper we propose a hierarchical clustering engine, called SnakeT, that organizes on the fly the search results drawn from 16 commodity search engines into a hier...