The Web is rapidly moving towards a platform for mass collaboration in content production and consumption. Fresh content on a variety of topics, people, and places is being create...
Yih-Farn Robin Chen, Giuseppe Di Fabbrizio, David ...
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
In industrial manufacturing rigorous testing is used to ensure that the delivered products meet their specifications. Mechanical maladjustment or faults often show their presence t...
This paper proposes an approach to the problem of generating metadata for composite mixed-media digital objects by appropriately combining and exploiting existing knowledge or met...
When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...