This paper reports on the benefits of large-scale statistical language modeling in machine translation. A distributed infrastructure is proposed which we use to train on up to 2 t...
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz Jo...
In this paper, we address the problem of extracting data records and their attributes from unstructured biomedical full text. There has been little effort reported on this in the ...
This paper presents a syntax-driven approach to question answering, specifically the answer-sentence selection problem for short-answer questions. Rather than using syntactic fea...
Anticipating the availability of large question-answer datasets, we propose a principled, data-driven Instance-Based approach to Question Answering. Most question answering systems ...
This paper presents Japanese morphological analysis based on conditional random fields (CRFs). Previous work in CRFs assumed that observation sequence (word) boundaries were fixed...