Sciweavers

SIGIR
2005
ACM

Automatic web query classification using labeled and unlabeled training data

Accurate topical categorization of user queries allows for increased effectiveness, efficiency, and revenue potential in general-purpose web search systems. Such categorization becomes critical if the system is to return results not just from a general web collection but from topic-specific databases as well. Maintaining sufficient categorization recall is very difficult as web queries are typically short, yielding few features per query. We examine three approaches to topical categorization of general web queries: matching against a list of manually labeled queries, supervised learning of classifiers, and mining of selectional preference rules from large unlabeled query logs. Each approach has its advantages in tackling the web query classification recall problem, and combining the three techniques allows us to classify a substantially larger proportion of queries than any of the individual techniques. We examine the performance of each approach on a real web query stream and show th...
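As a rough illustration of why combining the three approaches can raise recall, here is a minimal Python sketch. It is not the authors' implementation. All data, names (`LABELED_QUERIES`, `TRAINING`, `PREFERENCE_RULES`, `classify`) and the union-based combination are assumptions made for the example. The supervised component is replaced by a toy term-vote classifier, and the selectional preference rules are written by hand rather than mined from a query log.

```python
from collections import Counter, defaultdict

# Hypothetical toy data; none of it comes from the paper.
LABELED_QUERIES = {
    "britney spears": {"music"},
    "nfl scores": {"sports"},
}

TRAINING = [
    ("cheap flights to paris", "travel"),
    ("hotel deals", "travel"),
    ("nfl scores", "sports"),
    ("football schedule", "sports"),
]

# Hand-written stand-ins for rules that would be mined from an unlabeled log.
PREFERENCE_RULES = {"lyrics": "music", "hotels": "travel"}


def normalize(query):
    """Lowercase the query and collapse repeated whitespace."""
    return " ".join(query.lower().split())


def exact_match(query):
    """Approach 1: look the whole query up in the manually labeled list."""
    return set(LABELED_QUERIES.get(normalize(query), set()))


def train_term_votes(examples):
    """Count, for each term, how often it appears under each category."""
    votes = defaultdict(Counter)
    for query, category in examples:
        for term in normalize(query).split():
            votes[term][category] += 1
    return votes


def supervised(query, votes):
    """Approach 2 (toy stand-in): pick the categories with the most term votes."""
    totals = Counter()
    for term in normalize(query).split():
        totals.update(votes.get(term, {}))
    if not totals:
        return set()
    best = max(totals.values())
    return {category for category, count in totals.items() if count == best}


def rule_match(query):
    """Approach 3 (toy stand-in): apply term-to-category preference rules."""
    terms = normalize(query).split()
    return {PREFERENCE_RULES[t] for t in terms if t in PREFERENCE_RULES}


def classify(query, votes):
    """Assumed combination: union of the categories from all three approaches."""
    return exact_match(query) | supervised(query, votes) | rule_match(query)


if __name__ == "__main__":
    votes = train_term_votes(TRAINING)
    for q in ["Britney Spears", "yesterday lyrics", "paris hotel", "xyzzy"]:
        print(q, "->", sorted(classify(q, votes)))
```

In this toy setup each query is covered by a different component, so the union classifies more queries than any single component does. A real system would also need to weigh the precision cost of taking a plain union.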
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where SIGIR
Authors Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, David A. Grossman, David D. Lewis, Abdur Chowdhury, Aleksander Kolcz