{"ID":2831791,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.07776","arxiv_id":"2512.07776","title":"GorillaWatch: An Automated System for In-the-Wild Gorilla Re-Identification and Population Monitoring","abstract":"Monitoring critically endangered western lowland gorillas is currently hampered by the immense manual effort required to re-identify individuals from vast archives of camera trap footage. The primary obstacle to automating this process has been the lack of large-scale, \"in-the-wild\" video datasets suitable for training robust deep learning models. To address this gap, we introduce a comprehensive benchmark with three novel datasets: Gorilla-SPAC-Wild, the largest video dataset for wild primate re-identification to date; Gorilla-Berlin-Zoo, for assessing cross-domain re-identification generalization; and Gorilla-SPAC-MoT, for evaluating multi-object tracking in camera trap footage. Building on these datasets, we present GorillaWatch, an end-to-end pipeline integrating detection, tracking, and re-identification. To exploit temporal information, we introduce a multi-frame self-supervised pretraining strategy that leverages consistency in tracklets to learn domain-specific features without manual labels. To ensure scientific validity, a differentiable adaptation of AttnLRP verifies that our model relies on discriminative biometric traits rather than background correlations. Extensive benchmarking subsequently demonstrates that aggregating features from large-scale image backbones outperforms specialized video architectures. Finally, we address unsupervised population counting by integrating spatiotemporal constraints into standard clustering to mitigate over-segmentation. 
We publicly release all code and datasets to facilitate scalable, non-invasive monitoring of endangered species.","short_abstract":"Monitoring critically endangered western lowland gorillas is currently hampered by the immense manual effort required to re-identify individuals from vast archives of camera trap footage. The primary obstacle to automating this process has been the lack of large-scale, \"in-the-wild\" video datasets suitable for training...","url_abs":"https://arxiv.org/abs/2512.07776","url_pdf":"https://arxiv.org/pdf/2512.07776v1","authors":"[\"Maximilian Schall\",\"Felix Leonard Knöfel\",\"Noah Elias König\",\"Jan Jonas Kubeler\",\"Maximilian von Klinski\",\"Joan Wilhelm Linnemann\",\"Xiaoshi Liu\",\"Iven Jelle Schlegelmilch\",\"Ole Woyciniuk\",\"Alexandra Schild\",\"Dante Wasmuht\",\"Magdalena Bermejo Espinet\",\"German Illera Basas\",\"Gerard de Melo\"]","published":"2025-12-08T17:58:20Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}