{"ID":2900905,"CreatedAt":"2026-06-01T05:51:17.9442275Z","UpdatedAt":"2026-06-01T06:23:29.641557848Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2605.30995","arxiv_id":"2605.30995","title":"Traceable by Design: An LLM Pipeline and Dashboard for EU Regulatory Consultation Analysis","abstract":"Public consultations generate large volumes of data in the form of stakeholder submissions that are practically unfeasible to analyse manually. We present an end-to-end LLM-based pipeline and interactive dashboard for structured topic extraction from regulatory consultation submissions, demonstrated on the European Commission's Digital Fairness Act (DFA) public call for evidence as a case study. The system processes raw PDF attachments and web-form responses, extracts topic annotations, and grounds every extraction in a verbatim quote from the source text. Applied to 4,322 DFA submissions, the pipeline produced 15,368 topic annotations supported by 20,951 verbatim evidence quotes. Three principles govern the proposed design: verbatim grounding, full traceability, and transparency by design. The dashboard exposes the full extraction dataset through five analytical views, from dataset-level topic overviews to individual paragraph drill-downs, with every result traceable to its source. Beyond the predefined DFA topic categories, the pipeline generated certain stakeholder concerns, such as Age Verification, Payment Processor Censorship, and Digital Ownership, that a fixed-taxonomy approach would have missed. The pipeline is domain-generic; adapting it to a new consultation requires only a prompt update and a new dataset. A live demo is available at https://dfa-dashboard.thalesbertaglia.com/. The code and processed data are publicly available at https://github.com/thalesbertaglia/dfa-dashboard.","short_abstract":"Public consultations generate large volumes of data in the form of stakeholder submissions that are practically unfeasible to analyse manually. We present an end-to-end LLM-based pipeline and interactive dashboard for structured topic extraction from regulatory consultation submissions, demonstrated on the European Com...","url_abs":"https://arxiv.org/abs/2605.30995","url_pdf":"https://arxiv.org/pdf/2605.30995v1","authors":"[\"Thales Bertaglia\",\"Haoyang Gui\",\"Catalina Goanta\",\"Gerasimos Spanakis\"]","published":"2026-05-29T08:29:00Z","proceeding":"cs.CY","tasks":"[\"cs.CY\",\"cs.CL\"]","methods":"[\"Large Language Model\"]","project_urls":"[\"https://dfa-dashboard.thalesbertaglia.com/\"]","has_code":false,"code_links":[{"ID":612542,"CreatedAt":"2026-06-01T05:51:17.9442275Z","UpdatedAt":"2026-06-01T05:51:17.9442275Z","DeletedAt":null,"paper_id":2900905,"paper_url":"https://arxiv.org/abs/2605.30995","paper_title":"Traceable by Design: An LLM Pipeline and Dashboard for EU Regulatory Consultation Analysis","repo_url":"https://github.com/thalesbertaglia/dfa-dashboard","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
