{"ID":2862836,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.26336","arxiv_id":"2509.26336","title":"UniSage: A Unified and Post-Analysis-Aware Sampling for Microservices","abstract":"Traces and logs serve as the backbone of observability in microservice architectures, yet their sheer volume imposes prohibitive storage and computational burdens. To reduce overhead, operators rely on sampling; however, current frameworks generally employ a sample-before-analysis strategy. This approach creates a fundamental trade-off: to save space, systems must discard data before knowing its diagnostic value, often losing critical context required for troubleshooting anomalies and latency spikes. In this paper, we propose UniSage, a unified sampling framework that addresses this trade-off by adopting a post-analysis-aware paradigm. Unlike prior works that focus solely on tracing, UniSageintegrates both traces and logs, leveraging a lightweight anomaly detection and root cause analysis module to scan the full data stream before sampling decisions are made. This pre-computation enables a dual-pillar strategy: an analysis-guided sampler that retains high-value data associated with detected anomalies, and an edge-case sampler that preserves rare but critical behaviors to ensure diversity. Evaluation on three datasets confirms that UniSage achieves superior data retention. At a 2.5% sampling rate, UniSage captures 71% of critical traces and 96.25% of relevant logs, substantially exceeding the best existing methods (which achieve 42.9% and 1.95%, respectively). Moreover, evaluations on a real-world dataset demonstrate UniSage's efficiency; it processes a 20-minute multi-modal data block in an average of 10 seconds, making it practical for production environments.","short_abstract":"Traces and logs serve as the backbone of observability in microservice architectures, yet their sheer volume imposes prohibitive storage and computational burdens. To reduce overhead, operators rely on sampling; however, current frameworks generally employ a sample-before-analysis strategy. This approach creates a fund...","url_abs":"https://arxiv.org/abs/2509.26336","url_pdf":"https://arxiv.org/pdf/2509.26336v2","authors":"[\"Zhouruixing Zhu\",\"Zhihan Jiang\",\"Tianyi Yang\",\"Pinjia He\"]","published":"2025-09-30T14:44:56Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[]","has_code":false}
