{"ID":3005040,"CreatedAt":"2026-06-03T03:09:48.883664427Z","UpdatedAt":"2026-06-05T07:16:01.131756733Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.03266","arxiv_id":"2606.03266","title":"ReforMe: Re-Shaping Documents with Contextual Prompting and Layout-Aware Propagation","abstract":"Digitizing complex documents with handwritten content, irregular tables, and heterogeneous layouts remains challenging, as traditional Optical Character Recognition (OCR) systems fail to capture writing nuances, author-specific conventions, and document structure, and recent LLM-based approaches lack mechanisms for precise, scalable correction. We present an interactive document digitization system that integrates layout-aware parsing, OCR, and LLM-based reconstruction with user-driven refinement. The system is informed by a formative study that identifies key challenges and interaction needs in real-world digitization workflows. It supports both direct edits and natural-language instructions, and introduces a layout-aware propagation mechanism that generalizes user corrections across structurally similar regions. This enables not only efficient error correction but also document re-shaping into structured, analyzable representations. We evaluate the system through a within-subjects user study (n=12) on real-world documents. Results show improved correction efficiency and reduced repetitive effort, demonstrating a more effective and controllable document digitization procedure.","short_abstract":"Digitizing complex documents with handwritten content, irregular tables, and heterogeneous layouts remains challenging, as traditional Optical Character Recognition (OCR) systems fail to capture writing nuances, author-specific conventions, and document structure, and recent LLM-based approaches lack mechanisms for pre...","url_abs":"https://arxiv.org/abs/2606.03266","url_pdf":"https://arxiv.org/pdf/2606.03266v1","authors":"[\"Nabin Khanal\",\"Tongyan Wang\",\"Jui-Cheng Chiu\",\"Ningning Nicole Kong\",\"Hannah Yanhua Zong\",\"Yingjie Victor Chen\"]","published":"2026-06-02T07:31:46Z","proceeding":"cs.HC","tasks":"[\"cs.HC\"]","methods":"[\"Large Language Model\"]","has_code":false}
