Hierarchical Evaluation Function: A Multi-Metric Approach for Optimizing Demand Forecasting Models
Abstract
Demand forecasting in competitive, uncertain business environments requires models that can integrate multiple evaluation perspectives rather than being restricted to hyperparameter optimization based on a single metric. This traditional approach tends to prioritize one error indicator, which can bias results when metrics provide contradictory signals. In this context, the Hierarchical Evaluation Function (HEF) is proposed as a multi-metric framework for hyperparameter optimization that integrates explanatory power (R2), sensitivity to extreme errors (RMSE), and average accuracy (MAE). The performance of HEF was assessed using four widely recognized benchmark datasets in the forecasting domain: Walmart, M3, M4, and M5. Prediction models were optimized through Grid Search, Particle Swarm Optimization (PSO), and Optuna, and statistical analyses based on difference-of-proportions tests confirmed that HEF delivers superior results compared to a unimetric reference function, regardless of the optimizer employed, with particular relevance for heterogeneous monthly time series (M3) and highly granular daily demand scenarios (M5). The findings demonstrate that HEF improves stability, generalization, and robustness at low computational cost, consolidating its role as a reliable evaluation framework that enhances model selection, enables more accurate demand forecasts, and supports decision-making in dynamic, competitive business environments.