Online Price Competition under Generalized Linear Demands
Abstract
We study sequential price competition among $N$ sellers, each of whose demand depends on the prices set by its rivals. Specifically, the demand function of seller $i$ follows the single-index model $\lambda_i(\mathbf{p}) = \mu_i(\langle \boldsymbol{\theta}_{i,0}, \mathbf{p} \rangle)$, with a known increasing link function $\mu_i$ and an unknown parameter $\boldsymbol{\theta}_{i,0}$, where $\mathbf{p}$ denotes the vector of prices offered simultaneously by all sellers at a given instant. Each seller observes only its own realized demand, which is unobservable to competitors, and the prices set by its rivals. We propose a novel decentralized policy, PML-GLUCB, that combines penalized maximum likelihood estimation (MLE) with an upper-confidence pricing rule. Our approach (i) \emph{removes the need for coordinated front-loaded exploration phases across sellers}, a requirement integral to previous approaches, which aligns our method more closely with realistic market conditions; (ii) generalizes existing approaches that focus solely on linear demand models; and (iii) accommodates both binary and real-valued demand observations. Relative to a dynamic benchmark policy, each seller achieves $\widetilde{O}(\sqrt{T})$ regret, which matches the optimal rate known in the linear setting.
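To make the demand model concrete, the following minimal sketch evaluates the single-index demand $\mu_i(\langle \boldsymbol\theta_{i,0}, \mathbf p \rangle)$ and draws one binary demand observation. It is an illustration under stated assumptions, not the PML-GLUCB policy itself: the logistic link, the parameter matrix `THETA`, the price vector, and all function names are hypothetical choices made here and are not taken from the paper.

```python
import math
import random


def logistic(z):
    """An increasing link mu(z) = 1 / (1 + exp(-z)); chosen only for illustration."""
    return 1.0 / (1.0 + math.exp(-z))


def expected_demand(theta_i, prices, link=logistic):
    """Single-index demand: lambda_i(p) = mu_i(<theta_i, p>)."""
    index = sum(t * p for t, p in zip(theta_i, prices))
    return link(index)


def sample_binary_demand(theta_i, prices, rng):
    """One Bernoulli demand observation whose mean is lambda_i(p)."""
    return 1 if rng.random() < expected_demand(theta_i, prices) else 0


# Hypothetical parameters for N = 3 sellers (assumed, not from the paper).
# Row i is theta_i: a negative own-price coefficient and positive
# cross-price coefficients, so rivals' price increases raise seller i's demand.
THETA = [
    [-1.0, 0.3, 0.2],
    [0.25, -1.2, 0.3],
    [0.2, 0.4, -0.9],
]

rng = random.Random(0)          # fixed seed so the draw is reproducible
prices = [1.0, 1.5, 1.2]        # prices posted simultaneously by all sellers

for i, theta_i in enumerate(THETA):
    mean = expected_demand(theta_i, prices)
    obs = sample_binary_demand(theta_i, prices, rng)
    print(f"seller {i}: expected demand {mean:.3f}, observed {obs}")
```

In this sketch each seller would see only its own `obs` together with the full `prices` vector, mirroring the information structure described above; a real-valued demand observation would replace the Bernoulli draw with a noisy observation whose mean is `expected_demand`.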