This paper considers a stylized model of a dynamic assortment optimization problem in which, subject to a capacity constraint, we must decide which assortment of products to offer to customers in order to maximize profit. Our model is motivated by the problem retailers face when stocking products on a shelf with limited capacity, and by the problem of placing a limited number of ads on a web page. We assume that each customer chooses to purchase the product (or to click on the ad) that maximizes her utility, and we use the multinomial logit choice model to represent demand. The demand for each product is not known in advance. Instead, we learn the demand distribution by offering different product assortments, observing the resulting selections, and inferring the distribution from past selections and assortment decisions. We present an adaptive policy for joint parameter estimation and assortment optimization. To evaluate our proposed policy, we define a benchmark profit as the maximum expected p...
Paat Rusmevichientong, Zuo-Jun Max Shen, David B.
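
As a rough illustration of the static problem described in the abstract, the sketch below computes multinomial logit choice probabilities and selects a capacity-constrained assortment by exhaustive search. It is not the adaptive policy of the paper. The names `mu` (mean utilities), `profit` (per-product profit), and `capacity`, the normalization of the no-purchase utility to zero, and the brute-force enumeration are all assumptions made for this example.

```python
from itertools import combinations
from math import exp


def mnl_choice_probs(assortment, mu):
    """Purchase probability of each offered product under a multinomial logit model.

    Assumption: the no-purchase option has mean utility 0, so its weight is exp(0) = 1.
    """
    weights = {i: exp(mu[i]) for i in assortment}
    denom = 1.0 + sum(weights.values())
    return {i: w / denom for i, w in weights.items()}


def expected_profit(assortment, mu, profit):
    """Expected profit per customer when the given assortment is offered."""
    probs = mnl_choice_probs(assortment, mu)
    return sum(profit[i] * p for i, p in probs.items())


def best_assortment(mu, profit, capacity):
    """Enumerate every assortment of size at most `capacity` and keep the most profitable.

    Brute force is used only for clarity; it is exponential in the number of products.
    """
    products = range(len(mu))
    best, best_val = (), 0.0  # the empty assortment earns zero profit
    for k in range(1, capacity + 1):
        for subset in combinations(products, k):
            val = expected_profit(subset, mu, profit)
            if val > best_val:
                best, best_val = subset, val
    return best, best_val


if __name__ == "__main__":
    # Hypothetical inputs: three products, shelf space for two.
    mu = [0.5, 0.0, -0.3]
    profit = [1.0, 2.0, 3.0]
    print(best_assortment(mu, profit, capacity=2))
```

In the setting of the abstract the utilities in `mu` are unknown and must be estimated from observed selections, so a routine like `best_assortment` would be applied to estimates rather than to true parameters.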