It is common practice in audiovisual archives to disclose documents using metadata from a structured vocabulary or thesaurus. Many of these thesauri have limited or no structure. T...
Traditionally, popular synonym acquisition methods are based on the distributional hypothesis, and a metric such as Jaccard coefficients is used to evaluate the similarity between...
Thesaurus is a collection of words classified according to some relatedness measures among them. In this paper, we lay the theoretical foundations of thesaurus construction through...
A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. W...
A catalogue holds information about a set of objects, typically classified using terms taken from a given thesaurus, and described with the help of a set of attributes. Matching a ...
This paper presents a method for automatically generating an association thesaurus from a text corpus, and demonstrates its application to information retrieval. The thesaurus gen...
Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of ...
Cross-language document retrieval systems require support by some kind of multilingual thesaurus for semantically indexing documents in different languages. The peculiarities of t...
This paper describes a system for linking the thesaurus of the Netherlands Institute for Sound and Vision to English WordNet and dbpedia. The thesaurus contains subject (concept) ...
A hierarchical browsing interface to a document collection can be constructed by identifying the phrases that recur in the full text of the documents and structuring them into a h...