Multi-Objective $\textit{min-max}$ Online Convex Optimization

cs.LG arXiv:2510.13560
View PDF arXiv JSON

Abstract

In this paper, we broaden the horizon of online convex optimization (OCO), and consider multi-objective OCO, where there are $K$ distinct loss function sequences, and an algorithm has to choose its action at time $t$, before the $K$ loss functions at time $t$ are revealed. To capture the tradeoff between tracking the $K$ different sequences, we consider the {\it min-max} regret, where the benchmark (optimal offline algorithm) takes a static action across all time slots that minimizes the maximum of the total loss (summed across time slots) incurred by each of the $K$ sequences. An online algorithm is allowed to change its action across time slots, and its {\it min-max} regret is defined as the difference between its {\it min-max} cost and that of the benchmark. The {\it min-max} regret is a stringent performance measure and an algorithm with small regret needs to `track' all loss functions simultaneously. We first show that with adversarial input, {\it min-max} regret scales linearly with the time horizon $T$ for any online algorithm. Consequently, we consider a stochastic i.i.d. input model where all loss functions are i.i.d. generated from an unknown joint distribution and propose a simple algorithm that combines the well-known {\it Hedge} and online gradient descent (OGD) and show via a remarkably simple proof that its expected {\it min-max} regret is $O(\sqrt{T \log (T K)})$. Analogous results are also derived for Martingale difference and Markov input models.

PDF Viewer