EUROSYS
2006
ACM

Ferret: a toolkit for content-based similarity search of feature-rich data

Building content-based search tools for feature-rich data has been a challenging problem because feature-rich data such as audio recordings, digital images, and sensor data are inherently noisy and high dimensional. Comparing noisy data requires comparisons based on similarity instead of exact matches, and thus searching for noisy data requires similarity search instead of exact search. The Ferret toolkit is designed to help system builders quickly construct content-based similarity search systems for feature-rich data types. The key component of the toolkit is a content-based similarity search engine for generic, multifeature object representations. To solve the similarity search problem in high-dimensional spaces, we have developed approximation methods inspired by recent theoretical results on dimension reduction. The search engine constructs sketches from feature vectors as highly compact data structures for matching, filtering and ranking data objects. The toolkit also includes ...
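The filter-then-rank idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how Ferret builds its sketches, so the code below substitutes a well-known stand-in, sign random projections, which produce compact bit vectors whose Hamming distance approximates the angle between the original feature vectors. All function names and parameters (`make_hyperplanes`, `sketch`, `search`, `n_candidates`, `k`) are hypothetical and are not part of the Ferret toolkit.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sketch(vec, planes):
    """Pack one bit per hyperplane: 1 if vec lies on its non-negative side."""
    bits = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer-encoded sketches."""
    return bin(a ^ b).count("1")


def cosine_distance(a, b):
    """Exact cosine distance; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return 1.0 - dot / (norm_a * norm_b)


def search(query, dataset, planes, sketches, n_candidates=10, k=3):
    """Filter by sketch Hamming distance, then rank survivors by exact distance."""
    q = sketch(query, planes)
    # Filtering step: keep the n_candidates objects with the closest sketches.
    candidates = sorted(range(len(dataset)),
                        key=lambda i: hamming(q, sketches[i]))[:n_candidates]
    # Ranking step: order the candidates by exact distance on full feature vectors.
    ranked = sorted(candidates, key=lambda i: cosine_distance(query, dataset[i]))
    return ranked[:k]


if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(50)]
    planes = make_hyperplanes(dim=16, n_bits=64, seed=42)
    sketches = [sketch(v, planes) for v in data]
    print(search(data[7], data, planes, sketches))
```

In this sketch each 16-dimensional vector is reduced to a 64-bit integer, so the filtering pass touches only compact sketches and the exact distance is computed for a small candidate set. The choice of 64 bits and 10 candidates is arbitrary and would need tuning against real data.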
Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li
Added 10 Mar 2010
Updated 10 Mar 2010
Type Conference
Year 2006
Where EUROSYS
Authors Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li