{"ID":2840836,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.13944","arxiv_id":"2511.13944","title":"Find the Leak, Fix the Split: Cluster-Based Method to Prevent Leakage in Video-Derived Datasets","abstract":"We propose a cluster-based frame selection strategy to mitigate information leakage in video-derived frame datasets. By grouping visually similar frames before splitting into training, validation, and test sets, the method produces more representative, balanced, and reliable dataset partitions.","short_abstract":"We propose a cluster-based frame selection strategy to mitigate information leakage in video-derived frame datasets. By grouping visually similar frames before splitting into training, validation, and test sets, the method produces more representative, balanced, and reliable dataset partitions.","url_abs":"https://arxiv.org/abs/2511.13944","url_pdf":"https://arxiv.org/pdf/2511.13944v2","authors":"[\"Noam Glazner\",\"Noam Tsfaty\",\"Sharon Shalev\",\"Avishai Weizman\"]","published":"2025-11-17T21:57:46Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}
