Automatic recognition of named entities such as people, places, organizations, books, and movies across the entire web presents a number of challenges, both of scale and scope. Da...
Casey Whitelaw, Alexander Kehlenbeck, Nemanja Petr...
It is crucial in many information systems to organize short text segments, such as keywords in documents and queries from users, into a well-formed topic hierarchy. In this paper,...
Combating Web spam is one of the greatest challenges for Web search engines. State-of-the-art anti-spam techniques focus mainly on detecting varieties of spam strategies, such as ...
Chao Wei, Yiqun Liu, Min Zhang, Shaoping Ma, Liyun...
Abstract. Nowadays, the demand on querying and searching the Semantic Web is increasing. Some systems have adopted IR (Information Retrieval) approaches to index and search the Sem...
Yan Liang, Haofen Wang, Qiaoling Liu, Thanh Tran, ...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...