Test collection management and labeling system

Download ecologylab.cs.tamu.edu

In order to evaluate the performance of information retrieval and extraction algorithms, we need test collections. A test collection consists of a set of documents, a clearly formed problem that an algorithm is supposed to provide solutions to, and the answers that the algorithm should produce when executed on the documents. Defining the association between elements in the test collection and answers is known as labeling. For mainstream information retrieval problems, there are publicly available test collections which have been maintained for years. However, the scope of these problems, and thus the associated test collections, is limited. In other cases, researchers need to build, label, and manage their own test collections, which can be a tedious and error-prone task. We built test collections of HTML documents, for problems in which the answer that the algorithm supplies is a sub-tree of the DOM (Document Object Model). To lighten the burden of this task, we developed a test coll...
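The structure described in the abstract (documents, a problem statement, and labeled answers that are DOM sub-trees) can be illustrated with a small sketch. This is not the authors' system: the names `LabeledDocument`, `LabeledCollection`, `answer_path`, and `longest_own_text` are hypothetical, the documents are assumed to be well-formed XHTML so that the Python standard library can parse them, and each label is assumed to be a path to the root of the answer sub-tree.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET


@dataclass
class LabeledDocument:
    doc_id: str
    html: str         # assumed well-formed XHTML so ElementTree can parse it
    answer_path: str  # label: path to the root of the answer sub-tree


@dataclass
class LabeledCollection:
    problem: str  # the question the algorithm is supposed to answer
    documents: list = field(default_factory=list)

    def accuracy(self, algorithm):
        """Fraction of documents where the algorithm returns the labeled node."""
        if not self.documents:
            return 0.0
        correct = 0
        for doc in self.documents:
            root = ET.fromstring(doc.html)
            expected = root.find(doc.answer_path)
            predicted = algorithm(root)
            # Identity check: both nodes come from the same parsed tree.
            if expected is not None and predicted is expected:
                correct += 1
        return correct / len(self.documents)


def longest_own_text(root):
    """Toy algorithm: pick the element with the longest direct text."""
    return max(root.iter(), key=lambda el: len(el.text or ""))


collection = LabeledCollection(
    problem="Find the element holding the main content",
    documents=[
        LabeledDocument(
            "doc1",
            "<html><body><div>Menu</div>"
            "<div><p>A long article paragraph about test collections.</p></div>"
            "</body></html>",
            "./body/div[2]/p",
        ),
        LabeledDocument(
            "doc2",
            "<html><body><div>A very long navigation banner text here</div>"
            "<div><p>Short.</p></div></body></html>",
            "./body/div[2]/p",
        ),
    ],
)

score = collection.accuracy(longest_own_text)
```

Storing the label as a path rather than as a copy of the sub-tree keeps the answer tied to the document's own DOM, so a prediction can be compared against the label by node identity. Real HTML is often not well-formed, so a practical tool would likely need a tolerant HTML parser instead of the XML parser assumed here.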

Eunyee Koh, Andruid Kerne, Sarah Berry

Associated Test Collections | DOCENG 2009 | Document Analysis | Document Object Model | Test Collection

Post Info

Added	28 May 2010
Updated	28 May 2010
Type	Conference
Year	2009
Where	DOCENG
Authors	Eunyee Koh, Andruid Kerne, Sarah Berry
