Digital libraries take many forms: institutional libraries for information dissemination, document repositories for recordkeeping, and personal digital libraries for organizing...
Subspace learning techniques such as PCA, ICA, and LPP are widespread in pattern recognition research. These techniques are generally linear and unsupervised. The problem o...
This paper describes nonparametric Bayesian treatments for analyzing records containing occurrences of items. The introduced model retains the strength of previous approaches that...
XML is fast becoming the standard format for storing, exchanging, and publishing data over the web, and is increasingly embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
In this paper, we propose a new similarity measure for computing the pairwise similarity of text-based documents, based on the suffix tree document model. By applying the new suffix tree ...