{"ID":2839054,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.16315","arxiv_id":"2511.16315","title":"BioBench: A Blueprint to Move Beyond ImageNet for Scientific ML Benchmarks","abstract":"ImageNet-1K linear-probe transfer accuracy remains the default proxy for visual representation quality, yet it no longer predicts performance on scientific imagery. Across 46 modern vision model checkpoints, ImageNet top-1 accuracy explains only 34% of variance on ecology tasks and mis-ranks 30% of models above 75% accuracy. We present BioBench, an open ecology vision benchmark that captures what ImageNet misses. BioBench unifies 9 publicly released, application-driven tasks, 4 taxonomic kingdoms, and 6 acquisition modalities (drone RGB, web video, micrographs, in-situ and specimen photos, camera-trap frames), totaling 3.1M images. A single Python API downloads data, fits lightweight classifiers to frozen backbones, and reports class-balanced macro-F1 (plus domain metrics for FishNet and FungiCLEF); ViT-L models evaluate in 6 hours on an A6000 GPU. BioBench provides new signal for computer vision in ecology and a template recipe for building reliable AI-for-science benchmarks in any domain. Code and predictions are available at https://github.com/samuelstevens/biobench and results at https://samuelstevens.me/biobench.","short_abstract":"ImageNet-1K linear-probe transfer accuracy remains the default proxy for visual representation quality, yet it no longer predicts performance on scientific imagery. Across 46 modern vision model checkpoints, ImageNet top-1 accuracy explains only 34% of variance on ecology tasks and mis-ranks 30% of models above 75% acc...","url_abs":"https://arxiv.org/abs/2511.16315","url_pdf":"https://arxiv.org/pdf/2511.16315v1","authors":"[\"Samuel Stevens\"]","published":"2025-11-20T12:46:33Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","project_urls":"[\"https://samuelstevens.me/biobench\"]","has_code":false,"code_links":[{"ID":606836,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2839054,"paper_url":"https://arxiv.org/abs/2511.16315","paper_title":"BioBench: A Blueprint to Move Beyond ImageNet for Scientific ML Benchmarks","repo_url":"https://github.com/samuelstevens/biobench","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
