Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches which take advantage of the millions...
Large vocabulary speech recognition systems fail to recognize words beyond their vocabulary, many of which are information rich terms, like named entities or foreign words. Hybrid...
Carolina Parada, Mark Dredze, Abhinav Sethy, Ariya...
Current hidden Markov acoustic modeling for large vocabulary continuous speech recognition (LVCSR) relies on the availability of abundant labeled transcriptions. Given that speech...
Natural language technologies have been long envisioned to play a crucial role in transitioning from the current Web to a more "semantic" Web. If anything, the significa...
Peter Mika, Massimiliano Ciaramita, Hugo Zaragoza,...
The amount of text data on the Internet is growing at a very fast rate. Online text repositories for news agencies, digital libraries and other organizations currently store gigaan...