Many content-oriented applications require a scalable text index. Building such an index is challenging. In addition to the logic of inserting and searching documents, developers ...
Abstract. Textual reuse is an integral part of textual case-based reasoning (TCBR) which deals with solving new problems by reusing previous similar problem-solving experiences doc...
Ibrahim Adeyanju, Nirmalie Wiratunga, Juan A. Reci...
For discrete co-occurrence data like documents and words, calculating optimal projections and clustering are two different but related tasks. The goal of projection is to find a ...
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Krieg...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
In this paper the development of an opinion summarization system that works on Bengali News corpus has been described. The system identifies the sentiment information in each docu...