We demonstrate a phonotactic-semantic paradigm for spoken document categorization. In this framework, we define a set of acoustic words instead of lexical words to represent acous...
There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...
Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...
We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-pa...
Much has been documented in the literature on sentiment analysis and document summarisation. Much of this applies to long structured text in the form of documents and blog posts. W...
William Simm, Maria Angela Ferrario, Scott Songlin...
The use of the computing-with-words paradigm for automatic text document categorization is discussed. This specific problem of information retrieval (IR) becomes more...