Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...
Thanh Tin Tang, David Hawking, Nick Craswell, Kath...
Implicitly structured content on the Web such as HTML tables and lists can be extremely valuable for web search, question answering, and information retrieval, as the implicit str...
Web spider is a widely used approach to obtain information for search engines. As the size of the Web grows, it becomes a natural choice to parallelize the spider’s crawling proc...
In this paper we propose a novel, scalable, clustering based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone ...
Rashmin Babaria, J. Saketha Nath, S. Krishnan, K. ...
—We present an intelligent agent crawler designed to collect user-generated content in Second Life and related virtual worlds. The agents navigate autonomously through the world ...