We develop a generic method for the review matching problem: matching unstructured text reviews to a list of objects, where each object has a set of attributes. To this end, we propose a translation model for generating reviews from a structured description of objects. We develop an EM-based method to estimate the model parameters and use this model to find, given a review, the object most likely to be the topic of the review. We conduct extensive experiments on two large-scale datasets: a collection of restaurant reviews from Yelp and a collection of movie reviews from IMDb. The experiments show that our translation model-based method is superior to traditional tf-idf-based methods as well as a recent mixture model-based method for the review matching problem.

Categories and Subject Descriptors. I.2.7 [Computing Methodologies]: Natural Language Processing—Language models

General Terms. Algorithms, Experimentation, Measurements

Keywords. Language model; review matching; tran...
Nilesh N. Dalvi, Ravi Kumar, Bo Pang, Andrew Tomki
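
To make the idea in the abstract concrete, the following is a minimal sketch of one way a translation model could be trained with EM and then used for matching. It is not the authors' model: it assumes an IBM Model 1 style word-level translation table `t[(word, attribute)]`, a uniform choice of the generating attribute, and a small probability floor for unseen words. The function names, the floor value, and the toy restaurant data are all hypothetical.

```python
import math
from collections import defaultdict


def train_translation_model(pairs, iterations=10):
    """Estimate P(word | attribute) with EM (IBM Model 1 style; an assumption).

    pairs: list of (review_words, attribute_tokens) for reviews whose
    object is known.
    """
    vocab = {w for words, _ in pairs for w in words}
    uniform = 1.0 / len(vocab)
    t = defaultdict(lambda: uniform)  # start from a uniform table
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: spread each review word over the attributes of its object.
        for words, attrs in pairs:
            for w in words:
                z = sum(t[(w, a)] for a in attrs)
                for a in attrs:
                    p = t[(w, a)] / z
                    count[(w, a)] += p
                    total[a] += p
        # M-step: renormalise the expected counts per attribute.
        t = defaultdict(
            float, {(w, a): c / total[a] for (w, a), c in count.items()}
        )
    return t


def log_likelihood(words, attrs, t, floor=1e-6):
    """log P(review | object), with a hypothetical floor for unseen words."""
    return sum(
        math.log(floor + sum(t.get((w, a), 0.0) for a in attrs) / len(attrs))
        for w in words
    )


def match(words, objects, t):
    """Return the id of the object most likely to have generated the review."""
    return max(objects, key=lambda oid: log_likelihood(words, objects[oid], t))


# Toy data (hypothetical): two restaurants described by attribute tokens.
objects = {
    "r1": ["pizza", "italian", "downtown"],
    "r2": ["sushi", "japanese", "uptown"],
}
pairs = [
    (["great", "pepperoni", "slice"], objects["r1"]),
    (["fresh", "sashimi", "rolls"], objects["r2"]),
]
t = train_translation_model(pairs)
print(match(["pepperoni", "slice"], objects, t))
```

In this sketch, matching is an argmax over per-object log-likelihoods, which mirrors the abstract's "object most likely to be the topic of the review"; a realistic system would also need candidate pruning and a background model for generic words.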