{"ID":2887624,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.01361","arxiv_id":"2508.01361","title":"VLH: Vision-Language-Haptics Foundation Model","abstract":"We present VLH, a novel Visual-Language-Haptic Foundation Model that unifies perception, language, and tactile feedback in aerial robotics and virtual reality. Unlike prior work that treats haptics as a secondary, reactive channel, VLH synthesizes mid-air force and vibration cues as a direct consequence of contextual visual understanding and natural language commands. Our platform comprises an 8-inch quadcopter equipped with dual inverse five-bar linkage arrays for localized haptic actuation, an egocentric VR camera, and an exocentric top-down view. Visual inputs and language instructions are processed by a fine-tuned OpenVLA backbone - adapted via LoRA on a bespoke dataset of 450 multimodal scenarios - to output a 7-dimensional action vector (Vx, Vy, Vz, Hx, Hy, Hz, Hv). INT8 quantization and a high-performance server ensure real-time operation at 4-5 Hz. In human-robot interaction experiments (90 flights), VLH achieved a 56.7% success rate for target acquisition (mean reach time 21.3 s, pose error 0.24 m) and 100% accuracy in texture discrimination. Generalization tests yielded 70.0% (visual), 54.4% (motion), 40.0% (physical), and 35.0% (semantic) performance on novel tasks. These results demonstrate VLH's ability to co-evolve haptic feedback with perceptual reasoning and intent, advancing expressive, immersive human-robot interactions.","short_abstract":"We present VLH, a novel Visual-Language-Haptic Foundation Model that unifies perception, language, and tactile feedback in aerial robotics and virtual reality. \nUnlike prior work that treats haptics as a secondary, reactive channel, VLH synthesizes mid-air force and vibration cues as a direct consequence of contextual v...","url_abs":"https://arxiv.org/abs/2508.01361","url_pdf":"https://arxiv.org/pdf/2508.01361v1","authors":"[\"Luis Francisco Moreno Fuentes\",\"Muhammad Haris Khan\",\"Miguel Altamirano Cabrera\",\"Valerii Serpiva\",\"Dmitri Iarchuk\",\"Yara Mahmoud\",\"Issatay Tokmurziyev\",\"Dzmitry Tsetserukou\"]","published":"2025-08-02T13:30:04Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"LoRA\"]","has_code":false}
