On Retrieving Legal Files: Shortening Documents and Weeding Out Garbage

This paper describes our participation in the TREC Legal experiments in 2007. We applied novel normalization techniques designed to slightly favor longer documents rather than assuming that all documents should have equal weight. We also developed a new method for reformulating query text when background information is provided with an information request, and we experimented with enhanced OCR error detection to reduce the size of the term list and remove noise from the data. In this article, we discuss the impact of these techniques on the TREC 2007 data sets. We show that simple normalization methods significantly outperform cosine normalization in the legal domain.
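
To make the contrast in the abstract concrete, the sketch below compares standard cosine normalization with one possible "simple" length normalization. The abstract does not say which simple methods were used, so the logarithmic divisor, the function names, and the toy documents are all assumptions chosen for illustration, not the authors' formulas.

```python
import math
from collections import Counter


def cosine_score(doc_counts, query_terms):
    """Sum of query-term counts divided by the Euclidean norm of the document vector."""
    norm = math.sqrt(sum(c * c for c in doc_counts.values()))
    if norm == 0:
        return 0.0
    return sum(doc_counts.get(t, 0) for t in query_terms) / norm


def log_length_score(doc_counts, query_terms):
    """Hypothetical simple scheme (an assumption, not taken from the paper):
    divide by log(1 + total term occurrences), which grows slowly with length."""
    length = sum(doc_counts.values())
    if length == 0:
        return 0.0
    return sum(doc_counts.get(t, 0) for t in query_terms) / math.log(1 + length)


# Toy documents: raw term counts, invented for this example.
short_doc = Counter({"contract": 1, "breach": 1})
long_doc = Counter({"contract": 3, "breach": 3, "tort": 2, "claim": 2})
query = ["contract", "breach"]

for name, score in (("cosine", cosine_score), ("log-length", log_length_score)):
    print(name, "short:", score(short_doc, query), "long:", score(long_doc, query))
```

On these toy counts, cosine normalization ranks the short document first, while the slowly growing logarithmic divisor ranks the longer document first. That is the sense in which a simple normalization can slightly favor longer documents.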

Scott Kulp, April Kontostathis

Cosine Normalization | Simple Normalization Methods | TREC 2007 | TREC 2007 Data | TREC 2008 |

Added	07 Nov 2010
Updated	07 Nov 2010
Type	Conference
Year	2007
Where	TREC
Authors	Scott Kulp, April Kontostathis
