Traditional n-gram language models are widely used in state-of-the-art large vocabulary speech recognition systems. These simple models suffer from some limitations, such as overfi...
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering bec...
The Linguistic Data Consortium (LDC) is currently involved in a major effort to expand its multilingual text resources, in particular for machine translation, message understandin...
In the drive to improve patient safety, patients in modern intensive care units are closely monitored, which generates very large volumes of data. Unless the data are further...
Through the Internet and the World-Wide Web, a vast number of information sources have become available, which offer information on various subjects by different providers, often i...