Conflation Methods and Spelling Mistakes - A Sensitivity Analysis in Information Retrieval



In some information retrieval scenarios, for example internal help desk systems, texts are entered into the document collection without proofreading. This can result in a relatively high number of spelling mistakes, which can skew the ranking of the documents retrieved for a query or even prevent relevant documents from being retrieved. We address this problem at the conflation stage of the retrieval process and evaluate whether conflation based on n-grams, which is said to be insensitive to misspellings, leads to better retrieval quality than commonly used stemming algorithms. To do so, we run tests on artificially corrupted test collections and examine which characteristics of the queries and the relevant documents influence the relative retrieval quality achieved with the different conflation methods.
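To illustrate why n-gram conflation is said to tolerate misspellings, here is a minimal Python sketch. The abstract does not state the n-gram length, the padding scheme, the similarity measure, or how the test collections were corrupted, so everything here is an assumption: the function names `char_ngrams`, `dice`, and `corrupt` are hypothetical, trigrams with `_` padding and the Dice coefficient are just one common choice, and swapping adjacent letters is only one plausible error model.

```python
import random


def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with '_' (assumed scheme)."""
    padded = "_" * (n - 1) + word.lower() + "_" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b, n=3):
    """Dice coefficient over the n-gram sets of two words (1.0 means identical sets)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def corrupt(text, error_rate, rng):
    """Swap adjacent letters with probability error_rate (an assumed error model)."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < error_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a letter is moved at most once
        else:
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the corruption is reproducible
    print(corrupt("retrieval of relevant documents", 0.1, rng))
    print(dice("retrieval", "retreival"))  # misspelling still shares several trigrams
    print(dice("retrieval", "database"))   # unrelated word shares none
```

A stemmer would typically map "retrieval" and "retreival" to different stems, so the two would not match at all, whereas their trigram sets still overlap substantially. Whether that overlap translates into better retrieval quality is the question the paper evaluates.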

Philipp Dopichaj, Theo Härder


GVD 2004 | GVD 2007 | Information Retrieval Scenarios | Relevant Documents | Retrieval Quality |


Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	GVD
Authors	Philipp Dopichaj, Theo Härder
