{"ID":2864492,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.00058","arxiv_id":"2510.00058","title":"Variable Rate Image Compression via N-Gram Context based Swin-transformer","abstract":"This paper presents an N-gram context-based Swin Transformer for learned image compression. Our method achieves variable-rate compression with a single model. By incorporating N-gram context into the Swin Transformer, we overcome its limitation of neglecting larger regions during high-resolution image reconstruction due to its restricted receptive field. This enhancement expands the regions considered for pixel restoration, thereby improving the quality of high-resolution reconstructions. Our method increases context awareness across neighboring windows, leading to a -5.86% improvement in BD-Rate over existing variable-rate learned image compression techniques. Additionally, our model improves the quality of regions of interest (ROI) in images, making it particularly beneficial for object-focused applications in fields such as manufacturing and industrial vision systems.","short_abstract":"This paper presents an N-gram context-based Swin Transformer for learned image compression. Our method achieves variable-rate compression with a single model. By incorporating N-gram context into the Swin Transformer, we overcome its limitation of neglecting larger regions during high-resolution image reconstruction du...","url_abs":"https://arxiv.org/abs/2510.00058","url_pdf":"https://arxiv.org/pdf/2510.00058v2","authors":"[\"Priyanka Mudgal\"]","published":"2025-09-28T23:46:32Z","proceeding":"eess.IV","tasks":"[\"eess.IV\",\"cs.CV\",\"cs.MM\"]","methods":"[\"Transformer\"]","has_code":false}