Creating a Research Collection of Question Answer Sentence Pairs with Amazon's Mechanical Turk


Each year NIST releases a set of (question, document ID, answer) triples for the factoid questions used in the TREC Question Answering track. While this resource is widely used and has proved useful for many purposes, it is too coarse-grained for many others. In this paper we describe how we used Amazon's Mechanical Turk to have multiple subjects read the documents and identify the sentences that contain the answer. For most of the 1911 questions in the test sets from 2002 to 2006, and for each of the documents said to contain an answer, the Question-Answer Sentence Pairs (QASP) corpus introduced in this paper contains the identified answer sentences. We believe that this corpus, which we will make available to the public, can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.

Michael Kaisser, John Lowe


Amazon's Mechanical Turk | Answer Sentence | Education | LREC 2008 | Question Answering Track |


Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Michael Kaisser, John Lowe
