{"ID":2825165,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.21555","arxiv_id":"2512.21555","title":"XTrace: A Non-Invasive Dynamic Tracing Framework for Android Applications in Production","abstract":"As the complexity of mobile applications grows exponentially and the fragmentation of user device environments intensifies, ensuring online application stability faces unprecedented challenges. Traditional methods, such as static logging and post-crash analysis, lack real-time contextual information, rendering them ineffective against \"ghost bugs\" that only manifest in specific scenarios. This highlights an urgent need for dynamic runtime observability: intercepting and tracing arbitrary methods in production without requiring an app release. We propose XTrace, a novel dynamic tracing framework. XTrace introduces a new paradigm of non-invasive proxying, which avoids direct modification of the virtual machine's underlying data structures. It achieves high-performance method interception by leveraging and optimizing the highly stable, built-in instrumentation mechanism of the Android ART virtual machine. Evaluated in a ByteDance application with hundreds of millions of daily active users, XTrace demonstrated production-grade stability and performance. Large-scale online A/B experiments confirmed its stability, showing no statistically significant impact (p \u003e 0.05) on Crash User Rate or ANR rate, while maintaining minimal overhead (\u003c7 ms startup latency, \u003c0.01 ms per-method call) and broad compatibility (Android 5.0-15+). Critically, XTrace diagnosed over 11 severe online crashes and multiple performance bottlenecks, improving root-cause localization efficiency by over 90%. This confirms XTrace provides a production-grade solution that reconciles the long-standing conflict between stability and comprehensive coverage in Android dynamic tracing.","short_abstract":"As the complexity of mobile applications grows exponentially and the fragmentation of user device environments intensifies, ensuring online application stability faces unprecedented challenges. Traditional methods, such as static logging and post-crash analysis, lack real-time contextual information, rendering them ine...","url_abs":"https://arxiv.org/abs/2512.21555","url_pdf":"https://arxiv.org/pdf/2512.21555v1","authors":"[\"Qi Hu\",\"Jiangchao Liu\",\"Xin Yu\",\"Lin Zhang\",\"Edward Jiang\"]","published":"2025-12-25T08:06:21Z","proceeding":"cs.SE","tasks":"[\"cs.SE\"]","methods":"[]","has_code":false}
