Term weighting strongly influences the performance of text mining and information retrieval approaches. Usually term weights are determined through statistical estimates based on s...
Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific known types of Web spa...
Yiqun Liu, Rongwei Cen, Min Zhang, Shaoping Ma, Li...
An important aspect of Semantic Web technologies is identity, that is, uniquely identifying resources, which is essential for integrating data across sources. Currently, th...
Dating of contents is relevant to multiple advanced Natural Language Processing (NLP) applications, such as Information Retrieval or Question Answering. These could be improved by...
This paper explores the concept of early discard for interactive search of unindexed data. Processing data inside storage devices using downloaded searchlet code enables Diamond t...
Larry Huston, Rahul Sukthankar, Rajiv Wickremesing...