{"ID":2888383,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.23544","arxiv_id":"2507.23544","title":"User Experience Estimation in Human-Robot Interaction Via Multi-Instance Learning of Multimodal Social Signals","abstract":"In recent years, the demand for social robots has grown, requiring them to adapt their behaviors based on users' states. Accurately assessing user experience (UX) in human-robot interaction (HRI) is crucial for achieving this adaptability. UX is a multi-faceted measure encompassing aspects such as sentiment and engagement, yet existing methods often focus on these individually. This study proposes a UX estimation method for HRI by leveraging multimodal social signals. We construct a UX dataset and develop a Transformer-based model that utilizes facial expressions and voice for estimation. Unlike conventional models that rely on momentary observations, our approach captures both short- and long-term interaction patterns using a multi-instance learning framework. This enables the model to capture temporal dynamics in UX, providing a more holistic representation. Experimental results demonstrate that our method outperforms third-party human evaluators in UX estimation.","short_abstract":"In recent years, the demand for social robots has grown, requiring them to adapt their behaviors based on users' states. Accurately assessing user experience (UX) in human-robot interaction (HRI) is crucial for achieving this adaptability. UX is a multi-faceted measure encompassing aspects such as sentiment and engagem...","url_abs":"https://arxiv.org/abs/2507.23544","url_pdf":"https://arxiv.org/pdf/2507.23544v1","authors":"[\"Ryo Miyoshi\",\"Yuki Okafuji\",\"Takuya Iwamoto\",\"Junya Nakanishi\",\"Jun Baba\"]","published":"2025-07-31T13:34:15Z","proceeding":"cs.RO","tasks":"[\"cs.RO\",\"cs.CV\",\"cs.HC\"]","methods":"[\"Transformer\"]","has_code":false}
