A Tractable Online Learning Algorithm for the Multinomial Logit Contextual Bandit

by Priyank Agrawal, Vashist Avadhanula, Theja Tulabandhula

Released as an article.



In this paper, we consider the contextual variant of the MNL-Bandit problem. More specifically, we consider a dynamic set optimization problem in which, in every round, a decision maker offers a subset (assortment) of products to a consumer and observes the consumer's response. Consumers purchase products so as to maximize their utility. We assume that the products are described by a set of attributes and that the mean utility of a product is linear in the values of these attributes. We model consumer choice behavior by means of the widely used Multinomial Logit (MNL) model, and consider the decision maker's problem of dynamically learning the model parameters while optimizing cumulative revenue over the selling horizon T. Though this problem has attracted considerable attention in recent times, many existing methods require solving an intractable non-convex optimization problem, and their theoretical performance guarantees depend on a problem-dependent parameter that could be prohibitively large. In particular, existing algorithms for this problem have regret bounded by O(√(κ d T)), where κ is a problem-dependent constant that can depend exponentially on the number of attributes. In this paper, we propose an optimistic algorithm and show that its regret is bounded by O(√(dT) + κ), a significant improvement over existing methods. Further, we propose a convex relaxation of the optimization step, which allows for tractable decision-making while retaining the favourable regret guarantee.
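The choice model described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only evaluates MNL purchase probabilities and the expected revenue of one assortment for a known parameter vector. The names `theta`, `features`, `prices`, and both function names are hypothetical, and the no-purchase option with utility normalized to zero is an assumed (though common) MNL convention that the abstract does not spell out.

```python
import math
from typing import Dict, Hashable, List, Optional, Sequence


def mnl_choice_probabilities(
    theta: Sequence[float],
    features: Dict[Hashable, Sequence[float]],
    assortment: List[Hashable],
) -> Dict[Optional[Hashable], float]:
    """Return MNL purchase probabilities for the offered products.

    The key None holds the probability of the assumed no-purchase option,
    whose utility is normalized to zero.
    """
    # Mean utility of each offered product is linear in its attributes.
    utilities = {
        i: sum(t * x for t, x in zip(theta, features[i])) for i in assortment
    }
    # Subtract the largest utility (including the outside option's 0)
    # before exponentiating, purely for numerical stability.
    shift = max([0.0] + list(utilities.values()))
    weights = {i: math.exp(u - shift) for i, u in utilities.items()}
    outside_weight = math.exp(-shift)
    total = outside_weight + sum(weights.values())

    probs: Dict[Optional[Hashable], float] = {
        i: w / total for i, w in weights.items()
    }
    probs[None] = outside_weight / total
    return probs


def expected_revenue(
    theta: Sequence[float],
    features: Dict[Hashable, Sequence[float]],
    prices: Dict[Hashable, float],
    assortment: List[Hashable],
) -> float:
    """Expected one-round revenue of an assortment under the MNL model."""
    probs = mnl_choice_probabilities(theta, features, assortment)
    return sum(prices[i] * probs[i] for i in assortment)


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the paper.
    theta = [0.5, -0.2]
    features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    prices = {"a": 4.0, "b": 6.0, "c": 5.0}
    print(mnl_choice_probabilities(theta, features, ["a", "b"]))
    print(expected_revenue(theta, features, prices, ["a", "b"]))
```

In the bandit setting the parameter vector is unknown and must be learned from observed choices; the sketch covers only the evaluation step that any assortment-selection rule would need.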

Type  article
Stage   submitted
Date   2021-03-07
Version   v3
Language   en
arXiv  2011.14033v3