{"ID":2867373,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.19532","arxiv_id":"2509.19532","title":"To Stream or Not to Stream: Towards A Quantitative Model for Remote HPC Processing Decisions","abstract":"Modern scientific instruments generate data at rates that increasingly exceed local compute capabilities and, when paired with the staging and I/O overheads of file-based transfers, also render file-based use of remote HPC resources impractical for time-sensitive analysis and experimental steering. Real-time streaming frameworks promise to reduce latency and improve system efficiency, but lack a principled way to assess their feasibility. In this work, we introduce a quantitative framework and an accompanying Streaming Speed Score to evaluate whether remote high-performance computing (HPC) resources can provide timely data processing compared to local alternatives. Our model incorporates key parameters including data generation rate, transfer efficiency, remote processing power, and file input/output overhead to compute total processing completion time and identify operational regimes where streaming is beneficial. We motivate our methodology with use cases from facilities such as APS, FRIB, LCLS-II, and the LHC, and validate our approach through an illustrative case study based on LCLS-II data. \nOur measurements show that streaming can achieve up to 97% lower end-to-end completion time than file-based methods under high data rates, while worst-case congestion can increase transfer times by over an order of magnitude, underscoring the importance of tail latency in streaming feasibility decisions.","short_abstract":"Modern scientific instruments generate data at rates that increasingly exceed local compute capabilities and, when paired with the staging and I/O overheads of file-based transfers, also render file-based use of remote HPC resources impractical for time-sensitive analysis and experimental steering. Real-time streaming...","url_abs":"https://arxiv.org/abs/2509.19532","url_pdf":"https://arxiv.org/pdf/2509.19532v2","authors":"[\"Flavio Castro\",\"Weijian Zheng\",\"Joaquin Chung\",\"Ian Foster\",\"Rajkumar Kettimuthu\"]","published":"2025-09-23T19:53:43Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.NI\"]","methods":"[]","has_code":false}
