{"ID":2897926,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.04430","arxiv_id":"2507.04430","title":"\"Hi AirStar, Guide Me to the Badminton Court.\"","abstract":"Unmanned Aerial Vehicles, operating in environments with relatively few obstacles, offer high maneuverability and full three-dimensional mobility. This allows them to rapidly approach objects and perform a wide range of tasks often challenging for ground robots, making them ideal for exploration, inspection, aerial imaging, and everyday assistance. In this paper, we introduce AirStar, a UAV-centric embodied platform that turns a UAV into an intelligent aerial assistant: a large language model acts as the cognitive core for environmental understanding, contextual reasoning, and task planning. AirStar accepts natural interaction through voice commands and gestures, removing the need for a remote controller and significantly broadening its user base. It combines geospatial knowledge-driven long-distance navigation with contextual reasoning for fine-grained short-range control, resulting in an efficient and accurate vision-and-language navigation (VLN) capability.Furthermore, the system also offers built-in capabilities such as cross-modal question answering, intelligent filming, and target tracking. With a highly extensible framework, it supports seamless integration of new functionalities, paving the way toward a general-purpose, instruction-driven intelligent UAV agent. The supplementary PPT is available at \\href{https://buaa-colalab.github.io/airstar.github.io}{https://buaa-colalab.github.io/airstar.github.io}.","short_abstract":"Unmanned Aerial Vehicles, operating in environments with relatively few obstacles, offer high maneuverability and full three-dimensional mobility. This allows them to rapidly approach objects and perform a wide range of tasks often challenging for ground robots, making them ideal for exploration, inspection, aerial ima...","url_abs":"https://arxiv.org/abs/2507.04430","url_pdf":"https://arxiv.org/pdf/2507.04430v1","authors":"[\"Ziqin Wang\",\"Jinyu Chen\",\"Xiangyi Zheng\",\"Qinan Liao\",\"Linjiang Huang\",\"Si Liu\"]","published":"2025-07-06T15:29:07Z","proceeding":"cs.RO","tasks":"[\"cs.RO\"]","methods":"[\"Language Model\",\"LoRA\"]","has_code":false}
