Sciweavers

DGO
2003

Reducing Storage Costs for Federated Search of Text Databases

In environments containing many text search engines, a federated search system provides people with a single point of access. When the search engines are managed by independent organizations, two key problems are discovering and representing the contents of each text database. Query-based sampling is a recent technique for discovering the contents of uncooperative databases in order to create database resource descriptions that support a variety of necessary capabilities. However, when the documents obtained by query-based sampling are very long, as is common in some government environments, disk storage costs can be surprisingly large. This paper investigates methods of pruning sampled documents to reduce storage costs. The experimental results demonstrate that disk storage costs can be reduced by 54-93% while causing only minor losses in federated search accuracy.
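To make the idea of pruning sampled documents concrete, here is a minimal Python sketch. The abstract does not say which pruning methods the paper evaluates, so the strategy shown (keeping only the first `max_terms` terms of each document) is an assumption chosen for illustration, and the names `prune_document` and `storage_reduction` are hypothetical.

```python
def prune_document(text: str, max_terms: int) -> str:
    """Keep only the first max_terms whitespace-separated terms.

    Assumption: truncation is just one conceivable pruning rule;
    it is not taken from the paper.
    """
    terms = text.split()
    return " ".join(terms[:max_terms])


def storage_reduction(original: list[str], pruned: list[str]) -> float:
    """Return the percentage of UTF-8 bytes saved by storing the pruned set."""
    before = sum(len(doc.encode("utf-8")) for doc in original)
    after = sum(len(doc.encode("utf-8")) for doc in pruned)
    if before == 0:
        return 0.0  # nothing stored, so nothing saved
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Toy stand-ins for documents obtained by query-based sampling.
    sampled = ["report " * 1000, "a short memo"]
    pruned = [prune_document(doc, 100) for doc in sampled]
    print(f"{storage_reduction(sampled, pruned):.1f}% of storage saved")
```

Whether a rule like this keeps federated search accuracy close to the unpruned baseline is the empirical question the paper addresses; the sketch only measures the storage side.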

Jie Lu, Jamie Callan

DGO 2003 | DGO 2007 | Disk Storage Costs | Query-based Sampling | Search Engines |

Related Content

» KEYNOTE: Keyword Search by Node Selection for Text Retrieval on DHT-Based P2P Networks

» Fast online index construction by geometric partitioning

» Adaptive Cluster-Distance Bounding for Nearest Neighbor Search in Image Databases

» Maximal metric margin partitioning for similarity search indexes

» STAIRS: Towards Efficient Full-Text Filtering and Dissemination in a DHT Environment

» Load balancing for term-distributed parallel retrieval

» Learning to reduce the semantic gap in web image retrieval and annotation

» The SOWES approach to P2P web search using semantic overlays

» Efficient Batch Top-k Search for Dictionary-based Entity Recognition

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	DGO
Authors	Jie Lu, Jamie Callan
