{"ID":2897024,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.08022","arxiv_id":"2507.08022","title":"CuriosAI Submission to the EgoExo4D Proficiency Estimation Challenge 2025","abstract":"This report presents the CuriosAI team's submission to the EgoExo4D Proficiency Estimation Challenge at CVPR 2025. We propose two methods for multi-view skill assessment: (1) a multi-task learning framework using Sapiens-2B that jointly predicts proficiency and scenario labels (43.6 % accuracy), and (2) a two-stage pipeline combining zero-shot scenario recognition with view-specific VideoMAE classifiers (47.8 % accuracy). The superior performance of the two-stage approach demonstrates the effectiveness of scenario-conditioned modeling for proficiency estimation.","short_abstract":"This report presents the CuriosAI team's submission to the EgoExo4D Proficiency Estimation Challenge at CVPR 2025. We propose two methods for multi-view skill assessment: (1) a multi-task learning framework using Sapiens-2B that jointly predicts proficiency and scenario labels (43.6 % accuracy), and (2) a two-stage pip...","url_abs":"https://arxiv.org/abs/2507.08022","url_pdf":"https://arxiv.org/pdf/2507.08022v1","authors":"[\"Hayato Tanoue\",\"Hiroki Nishihara\",\"Yuma Suzuki\",\"Takayuki Hori\",\"Hiroki Takushima\",\"Aiswariya Manojkumar\",\"Yuki Shibata\",\"Mitsuru Takeda\",\"Fumika Beppu\",\"Zhao Hengwei\",\"Yuto Kanda\",\"Daichi Yamaga\"]","published":"2025-07-08T12:33:02Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}