Two widely applicable classes of indexing methods today are the Inverted Index and the Superimposed Coding based Signature File (SC-SF). The former is the most efficient in query processing but requires extra storage comparable in size to the textbase itself, whereas the latter is the most efficient in storage utilization. The present study builds upon the results of previous research [2] and proposes a hybrid structure for text retrieval. The new structure, labelled S-Index, is shown to have tunable performance that ranges between two extremes. At one extreme, S-Index turns into a Signature File that involves zero information loss and is, in this respect, faster than the ordinary SC-SF method. At the other extreme, S-Index becomes an Inverted Index. The advantage of the proposed access method is that frequently queried sections of text are indexed via an Inverted Index, whereas the bulk of the textbase, which is not frequently targeted by...
Dimitrios Dervos, P. Linardis, Yannis Manolopoulos