Learning to rank with (a lot of) word features


Download ronan.collobert.com

In this article we present Supervised Semantic Indexing (SSI), which defines a class of nonlinear (quadratic) models that are discriminatively trained to map directly from the word content of a query-document or document-document pair to a ranking score. Like Latent Semantic Indexing (LSI), our models take account of correlations between words (synonymy, polysemy). However, unlike LSI, our models are trained from a supervised signal directly on the ranking task of interest, which we argue is the reason for our superior results. As the query and target texts are modeled separately, our approach is easily generalized to different retrieval tasks, such as cross-language retrieval or online advertising placement. Dealing with models on all pairs of word features is computationally challenging. We propose several improvements to our basic model for addressing this issue, including low-rank (but diagonal-preserving) representations, correlated feature hashing (CFH) and sparsification. We pr...
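The abstract does not spell out the scoring function, so the following is only a minimal sketch of what a low-rank, diagonal-preserving quadratic model could look like. It assumes the score takes the bilinear form f(q, d) = qᵀ(UᵀV + I)d over bag-of-words vectors q and d. The names `ssi_score`, `U`, `V`, `D` and `N`, the random weights, and the toy vectors are all hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

def ssi_score(q, d, U, V):
    """Assumed score f(q, d) = q^T (U^T V + I) d.

    Computed as (U q) . (V d) + q . d, so the D x D matrix of
    word-pair weights is never formed explicitly.
    """
    return float((U @ q) @ (V @ d) + q @ d)

rng = np.random.default_rng(0)
D, N = 1000, 50  # hypothetical vocabulary size and rank

# Untrained placeholder weights; in a real model these would be learned
# from ranking supervision.
U = rng.normal(scale=0.01, size=(N, D))
V = rng.normal(scale=0.01, size=(N, D))

# Toy binary bag-of-words vectors for a query and a document.
q = np.zeros(D)
q[[3, 17]] = 1.0
d = np.zeros(D)
d[[17, 256, 999]] = 1.0

score = ssi_score(q, d, U, V)
```

Under this assumption, the identity term preserves the exact word-match signal (the diagonal), while the UᵀV term captures correlations between different words using 2·N·D parameters instead of D².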



IR 2010 | Natural Language Processing | Retrieval Tasks | Semantic Indexing | Supervised Semantic Indexing |


Post Info

Added	28 Jan 2011
Updated	28 Jan 2011
Type	Journal
Year	2010
Where	IR
Authors	Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, Olivier Chapelle, Kilian Q. Weinberger
