Statistical approaches to language learning typically focus on either short-range syntactic dependencies or long-range semantic dependencies between words. We present a generative...
Thomas L. Griffiths, Mark Steyvers, David M. Blei,...
This paper presents a model for summarizing multiple untranscribed spoken documents. Without assuming the availability of transcripts, the model modifies a recently proposed unsup...
Abstract. Linguistic information can help improve evaluation of similarity between documents; however, the kind of linguistic information to be used depends on the task. In this pa...
Abstract. In this work we propose a fuzzy technique to compare XML documents belonging to a semi-structured flow and sharing a common vocabulary of tags. Our approach is based on t...
Paolo Ceravolo, Maria Cristina Nocerino, Marco Viv...
The XML format has become the standard for data exchange because it is self-describing and it stores not only information but also the relationships between data. Therefore it is u...