In this paper, we investigate the use of words and subwords (including both characters and syllables) in audio indexing for Mandarin Chinese spoken document retrieval. Two retrieva...
In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. Howeve...
When humans approach the task of text categorization, they interpret the specific wording of the document in the much larger context of their background knowledge and experience. ...
The Web is the archetypal information space but even on a well designed site it can be difficult to find all the information you need. It is impossible to design a site so that al...
In this article, we present a comprehensive study aimed at computing semantic relatedness of word pairs. We analyze the performance of a large number of semantic relatedness measu...