{"ID":2835498,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.23150","arxiv_id":"2511.23150","title":"Cascaded Robust Rectification for Arbitrary Document Images","abstract":"Document rectification in real-world scenarios poses significant challenges due to extreme variations in camera perspectives and physical distortions. Driven by the insight that complex transformations can be decomposed and resolved progressively, we introduce a novel multi-stage framework that progressively reverses distinct distortion types in a coarse-to-fine manner. Specifically, our framework first performs a global affine transformation to correct perspective distortions arising from the camera's viewpoint, then rectifies geometric deformations resulting from physical paper curling and folding, and finally employs a content-aware iterative process to eliminate fine-grained content distortions. To address limitations in existing evaluation protocols, we also propose two enhanced metrics: layout-aligned OCR metrics (AED/ACER) for a stable assessment that decouples geometric rectification quality from the layout analysis errors of OCR engines, and masked AD/AAD (AD-M/AAD-M) tailored for accurately evaluating geometric distortions in documents with incomplete boundaries. Extensive experiments show that our method establishes new state-of-the-art performance on multiple challenging benchmarks, yielding a substantial reduction of 14.1%-34.7% in the AAD metric and demonstrating superior efficacy in real-world applications. The code will be publicly available at https://github.com/chaoyunwang/ArbDR.","short_abstract":"Document rectification in real-world scenarios poses significant challenges due to extreme variations in camera perspectives and physical distortions. Driven by the insight that complex transformations can be decomposed and resolved progressively, we introduce a novel multi-stage framework that progressively reverses d...","url_abs":"https://arxiv.org/abs/2511.23150","url_pdf":"https://arxiv.org/pdf/2511.23150v1","authors":"[\"Chaoyun Wang\",\"Quanxin Huang\",\"I-Chao Shen\",\"Takeo Igarashi\",\"Nanning Zheng\",\"Caigui Jiang\"]","published":"2025-11-28T12:56:16Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false,"code_links":[{"ID":606515,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2835498,"paper_url":"https://arxiv.org/abs/2511.23150","paper_title":"Cascaded Robust Rectification for Arbitrary Document Images","repo_url":"https://github.com/chaoyunwang/ArbDR","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
