{"ID":2885879,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.04592","arxiv_id":"2508.04592","title":"Face-voice Association in Multilingual Environments (FAME) 2026 Challenge Evaluation Plan","abstract":"Advances in technology have led to the use of multimodal systems in various real-world applications. Of these, audio-visual systems are among the most widely used. In recent years, associating the face and voice of a person has gained attention due to the unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) 2026 Challenge focuses on exploring face-voice association under the unique condition of a multilingual scenario. This condition is inspired by the fact that half of the world's population is bilingual and people most often communicate in multilingual scenarios. The challenge uses a dataset named Multilingual Audio-Visual (MAV-Celeb) for exploring face-voice association in multilingual environments. This report describes the challenge, dataset, baseline models, and tasks for the FAME Challenge.","short_abstract":"Advances in technology have led to the use of multimodal systems in various real-world applications. Of these, audio-visual systems are among the most widely used. In recent years, associating the face and voice of a person has gained attention due to the unique correlation betw...","url_abs":"https://arxiv.org/abs/2508.04592","url_pdf":"https://arxiv.org/pdf/2508.04592v2","authors":"[\"Marta Moscati\",\"Ahmed Abdullah\",\"Muhammad Saad Saeed\",\"Shah Nawaz\",\"Rohan Kumar Das\",\"Muhammad Zaigham Zaheer\",\"Junaid Mir\",\"Muhammad Haroon Yousaf\",\"Khalid Malik\",\"Markus Schedl\"]","published":"2025-08-06T16:09:47Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[]","has_code":false}