
DEXAW
2008
IEEE

Self-Similarity Metric for Index Pruning in Conceptual Vector Space Models

One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, search must remain efficient despite the growth of the indexing structures. A widely accepted solution is to adopt smaller, or pruned, indexes that increase retrieval speed while keeping search quality as high as possible. This paper extends the notion of a pruned index to semantic search systems based on conceptual vector space models and proposes a new self-similarity metric for index pruning. A conceptual vector space model represents documents as vectors in an n-dimensional space where each dimension corresponds to an ontology concept. The pruning algorithm proposed in this paper acts on the basis of document self-similarity, preserving only the most significant components of a document's conceptual vector. Unlike many previously proposed algorithms, the self-similarity metric is only based on local informa...
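The abstract does not give the definition of the metric, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that "self-similarity" can be approximated by the cosine similarity between a document's full conceptual vector and its pruned version, and that components are kept in decreasing order of weight until a threshold is reached. The function names, the threshold value, and the concept labels are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {concept: weight}."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_by_self_similarity(doc, threshold=0.95):
    """Keep the highest-weight components of `doc` until the pruned vector's
    cosine similarity with the full vector reaches `threshold`.

    Assumption: this stands in for the paper's metric, which is not
    specified in the abstract. Only the document's own vector is used.
    """
    pruned = {}
    by_weight = sorted(doc.items(), key=lambda kv: abs(kv[1]), reverse=True)
    for concept, weight in by_weight:
        if weight == 0:
            break  # remaining components carry no information
        pruned[concept] = weight
        if cosine(pruned, doc) >= threshold:
            break
    return pruned


if __name__ == "__main__":
    # Hypothetical document vector over four ontology concepts.
    doc = {"search_engine": 0.8, "index": 0.6, "ontology": 0.1, "vector": 0.05}
    print(prune_by_self_similarity(doc, threshold=0.95))
```

In this sketch the pruning decision depends only on the document being pruned, with no collection-wide statistics, which is one possible reading of a metric based on local information.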
Dario Bonino, Fulvio Corno
Added 29 May 2010
Updated 29 May 2010
Type Conference
Year 2008
Where DEXAW
Authors Dario Bonino, Fulvio Corno